Perspectives and Practices in Teaching Introductory Data Science
Some initiatives such as
“I think introduction to data science is a pretty difficult course to teach… So, the other [intro course that I taught] was quite stable. But from my current experience with data science, stable will not be the word that I will describe.” [Participant 14]
“I hope that what I’m doing here [teaching IDS course] is useful” [Participant 13]
“I would say something I think about on this course a lot is… Is it necessary? Would students be better served by [other] classical intro data science course where they are doing [something else]. And I don’t know the answer.” [Participant 09]
“I know that other people have different configurations of introductory data science, and that includes … some aspects of statistics. So, I am curious about ultimately what you [this research] find out about that, and what is the right configuration?” [Participant 07]
We chose a qualitative research design (Merriam, 2009).
Our aim was to understand
how IDS instructors interpret their teaching experiences in IDS courses
“what meaning they attribute to their experiences” (Merriam, 2009, p. 23).
We recruited participants…
who had taught an IDS course at least twice at the undergraduate level
whose course titles included Data Science and one of the following keywords: Introduction, Principles, Elements, or Fundamentals
16 participants (2 pilot, 14 main study)
Participants received a gift card for their time.
Table 1: Institution Type
Institution | n |
---|---|
Research University | 6 |
Liberal Arts College | 8 |
Table 2: Where IDS Courses Are Offered
Course Offering Unit | n |
---|---|
Mathematics | 3 |
Mathematics and Statistics | 2 |
Statistics | 2 |
Center/Institute | 2 |
Statistics and Data Science | 2 |
Mathematics and Computer Science | 1 |
Data Analytics | 1 |
Other | 1 |
Table 3: Prerequisite Courses of IDS Courses
Item | Yes | No | Not Sure |
---|---|---|---|
Prerequisite | 6 | 8 | 0 |
Prerequisite to follow-up | 11 | 2 | 1 |
Table 5: Class Sizes
Class Size | n |
---|---|
300+ | 2 |
200-299 | 1 |
100-199 | 2 |
… | |
30-39 | 2 |
20-29 | 3 |
10-19 | 3 |
1-9 | 1 |
Large IDS classes tend to:
IDS instructors teaching small classes tend to provide more details on:
Instructors’ perceptions of the purposes, goals, and reasons for teaching data science reflect three orientations:
Enhance Data Literacy / Teach to Learn from Data
Familiarize Students with a Programming Language
Attract Students to a Major/Minor in Data Science
Topic | n |
---|---|
Statistical Inference | 7 |
Ethics | 6 |
Introduction to Machine Learning | 4 |
Text Analysis | 4 |
Clustering | 3 |
Programming Language | n |
---|---|
R | 7 |
Python | 2 |
Both Python and R | 2 |
Both SQL and R | 2 |
No programming language | 1 |
Subject-Specific Teaching Strategies
Table 8: Most Commonly Used Teaching Strategies
Teaching Strategy | n |
---|---|
Lecturing | 9 |
Live Coding | 8 |
Questioning | 3 |
Group Work | 3 |
Think-Pair-Share | 3 |
Topic-Specific Teaching Strategies
Only 3 IDS instructors mentioned topic-specific teaching strategies:
Storytelling while introducing real-world cases
Questioning while teaching data ethics
Role Playing while teaching how to join data sets
Tactile Simulation while teaching sampling distribution
Almost all IDS instructors changed their teaching over time. No pattern related to years of experience was observed.
Added/removed a prerequisite
Refined content choices
Refined pedagogical choices
No change (2 participants)
No longer co-teaching (2 participants)
Students come from almost every major/department & every grade level.
Students without a programming background tend to spend more time learning coding.
Almost every IDS student experiences difficulties in developing solid strategic knowledge*.
Some Example Statements
If you can code quickly, you’re a good data scientist.
They’re either good or bad at data science (lack of growth mindset).
Data science is a collection of big hammers: if they learn how to swing each of these hammers, they can whack every problem in the world.
Potential Sources:
“They are frustrated that it’s not a simple task, very similar to math anxiety.” [Participant 04]
“At first, they don’t understand that programming is trial and error a lot.” [Participant 11]
“Their motivation decreases when they see error message.” [Participants 11 & 12]
“Sometimes they are frustrated when they failed to complete a task in R.” [Participant 15]
“Seeing end products in data science creates an outcome-centric perception bias [among students].” [Participant 07]
How do IDS instructors balance teaching in different class sizes and diverse majors?
What is missing in large class sizes?
Lack of guidelines/student learning outcomes for IDS courses
Despite high self-efficacy beliefs, IDS instructors wondered how well their courses align with the growing consensus on
“What should an introductory data science course be?”
We are still beta-testing what to teach and how to teach an IDS course.
More systematic studies in IDS education with empirical data are needed.
A policy-level document is required to guide us along the way.
Asamoah, D. A., Doran, D., & Schiller, S. (2020). Interdisciplinarity in data science pedagogy: A foundational design. Journal of Computer Information Systems, 60(4), 370-377. https://doi.org/10.1080/08874417.2018.1496803
De Veaux, R. D., Agarwal, M., Averett, M., Baumer, B. S., Bray, A., Bressoud, T. C., … & Ye, P. (2017). Curriculum guidelines for undergraduate programs in data science. Annual Review of Statistics and Its Application, 4, 15-30.
Demirci, S., Dogucu, M., Zieffler, A., & Rosenberg, J. M. (2023). Learning difficulties of introductory data science students. In E. Jones (Ed.), Proceedings of the International Association for Statistical Education Satellite Conference. https://escholarship.org/uc/item/01p3k7f3
Kelleher, J. D., & Tierney, B. (2018). Data science. MIT Press.
Merriam, S. B. (2009). Qualitative research: A guide to design and implementation. San Francisco, CA: Jossey-Bass.
National Academies of Sciences, Engineering and Medicine Consensus Report (2018). Data science for undergraduates: Opportunities and options. Washington, https://nas.edu/envisioningds.
Qian, Y., & Lehman, J. (2017). Students’ misconceptions and other difficulties in introductory programming: A literature review. ACM Transactions on Computing Education (TOCE), 18(1), 1-24. https://doi.org/10.3102/0002831213477680
Schwab-McCoy, A., Baker, C. M., & Gasper, R. E. (2021). Data science in 2020: Computing, curricula, and challenges for the next 10 years. Journal of Statistics and Data Science Education, 29(sup1), S40-S50.
Yan, D., & Davis, G. E. (2019). A first course in data science. Journal of Statistics Education, 27(2), 99-109. https://doi.org/10.1080/10691898.2019.1623136
sinemdemirci.github.io/jsm-24