Course Level
CS1
Knowledge Unit
Fundamental Programming Concepts
Collection Item Type
Other Material Type
Synopsis

Given an increased focus on computer science education as a valuable context to teach data science (due in part to the potential of computing for accessing, processing, and analyzing digital datasets), there have been steady efforts to develop kindergarten through 12th grade (K-12) curricula that productively engage learners in these academic areas. Bootstrap: Data Science and Exploring Computer Science (ECS) are prominent curricular examples designed to support high school data science access in computing contexts. While these vital efforts have found success bridging computer and data science, there remain growing concerns about how we can ensure that such learning experiences support the demographically and intellectually diverse cohorts of students needed for field innovation, occupational attainment, and public literacy. Challenges to these efforts often persist because existing data sources and activities offered to students are typically shaped by others (e.g., curriculum designers, teachers, etc.) rather than by learners themselves. This results in inquiry-driven questions, processes, and outcomes that can restrict exploration and engagement, as opposed to inherently and authentically linking to learners’ diverse personal interests, styles, and concerns. Perspectives in culturally responsive computing (CRC) provide viable frames for how to design learning experiences that encourage learner access, empowerment, and personal interests, which are key features for spurring field diversity through learning. With this imperative and framing in mind, we share our project called “Coding Like a Data Miner” (CLDM), which leverages a social media-based application programming interface (API) to teach learners how to gather, process (or wrangle), analyze, and then communicate insights learned from “big data” sets.
We describe this design as sandbox data science (SDS)—an approach to computing-based data science that is consistent with CRC perspectives with demonstrated promise in broadening participation and enhancing productivity in computer science education. In this article, we share insights into our rationale and the theoretical perspectives that drive our curricular design. We then provide an overview of the curriculum with case examples of the sorts of pursuits that can be taken up by learners in this context. Finally, we reflect on CLDM and design principles that make SDS a viable approach to broadening computing-based data science participation and productivity. This curriculum and accompanying resources are publicly available for review, use and adaptation at www.abclearninglab.com/cldm.

ACM Digital Library Entry

Recommendations

Leverage for culturally responsive computing-based data science teaching and learning. (See paper)

Engagement Highlights

In this era of ongoing and exponential technical advancement, digital data permeates most aspects of daily life. From smart watches that gather and track our personal daily health activity data, to social media platforms that leverage our browsing histories to inform algorithms about user perspectives, decision-making, and behaviors, the collection and analysis of digital data has revolutionized how society functions. These developments are a part of a broader field defined more recently as "data science" [1] that explores data collection and analysis techniques at scales not possible decades before. Concomitant with these developments have been calls from teachers, researchers, and policymakers for education initiatives that will not only address a growing demand for data science professionals, but also prepare future innovators, data scientists, and informed citizens to advance the field and steward its impact [2, 3, 4]. The response has been a series of curricular design efforts intended to bring data science to existing academic disciplines (e.g., mathematics, physics, engineering, etc.) to both advance priorities within those fields and better understand the interdisciplinary nature of learning in these areas [5, 6]. Examples include activities where learners engage with a math- or physics-based dataset, and then use data science techniques to gain insights into features or phenomena in that context. While promising, these existing efforts often fail to engage with one key feature of contemporary data science due to its technical complexity: “big data” [7, 8], or the massive data sources that are typically generated through automated processes. One promising solution to this issue lies in the integration of data science and computer science education, with its emphasis on computational thinking [9].
Many of the actions needed to engage productively with data science, such as planning and enacting data collection, processing data into analyzable form, and then understanding and communicating data sets, can be enacted with relative ease through the application of computer programming. Features of computational thinking and practice such as pattern recognition, decomposition, abstraction, and algorithm design also hold value for conceptualizing data science processes, especially at scale [10].
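As a rough illustration of how these actions might be decomposed into small programming steps, the following Python sketch walks through collecting, processing, analyzing, and communicating a tiny dataset. It is not taken from the CLDM curriculum: the sample posts, the function names, and the choice of hashtag counting as the analysis are all assumptions made for illustration.

```python
from collections import Counter
import re

# Hypothetical sample records standing in for collected posts (not real data).
RAW_POSTS = [
    {"id": 1, "text": "Loving the new #robotics club at school!"},
    {"id": 2, "text": "Robotics practice today. #Robotics #STEM"},
    {"id": 3, "text": ""},
    {"id": 4, "text": "Art show this weekend #art #STEM"},
]


def collect():
    """Collection step: here it simply returns a copy of the in-memory sample."""
    return list(RAW_POSTS)


def process(posts):
    """Processing step: drop empty posts and extract lowercase hashtags."""
    cleaned = []
    for post in posts:
        text = post["text"].strip()
        if not text:
            continue  # skip records with no usable text
        tags = [tag.lower() for tag in re.findall(r"#(\w+)", text)]
        cleaned.append({"id": post["id"], "tags": tags})
    return cleaned


def analyze(cleaned):
    """Analysis step: count how many posts mention each hashtag."""
    counts = Counter()
    for post in cleaned:
        counts.update(set(post["tags"]))  # count each tag once per post
    return counts


def communicate(counts, top_n=3):
    """Communication step: format the most frequent hashtags as plain text."""
    return "\n".join(
        f"#{tag}: {n} post(s)" for tag, n in counts.most_common(top_n)
    )


if __name__ == "__main__":
    print(communicate(analyze(process(collect()))))
```

Keeping each stage in its own function mirrors the decomposition and abstraction practices named above: a learner could swap in a different data source or a different analysis without touching the other stages.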

One challenge with using computer science education as a context for teaching data science involves the creation and curation of learning experiences that promote diversity and inclusion, which are essential for the intellectual diversity that inspires field innovation. In practice, teachers may struggle to accommodate a wide variety of student interests, cultural histories, and geopolitical viewpoints, especially when using existing curricular models. Several earlier studies in computer science education, for example, have provided pre-curated data sources on which students can apply data science and computer programming techniques. While not free of value, this approach ultimately restricts the scope of student engagement, and the kinds of questions learners are able to explore. Due to the lack of learner involvement in data collection and analysis, personalization and relevance in their educational experiences are limited at best. In data science, these issues are further complicated by the fact that learners are often left out when creating datasets, choosing analytical approaches, and engaging with lines of inquiry, thus disconnecting learning experiences from learners and their diverse epistemological (e.g., inquiry) styles. The result is a body of evidence that endorses practice using data sources managed by others rather than by students themselves, restricting the scope and questions in education research and practice. In sum, while computer science education offers several key affordances and opportunities, there remains a set of accompanying equity issues that persist in computing education and have for many decades [11]. In fact, the literature is replete with evidence suggesting that instructional designs that do not account for learners’ diverse social and cultural experiences can have adverse and lasting impacts on learning outcomes [12, 13], including choices regarding field participation [14]. 
This has been shown to contribute to field attrition and severe underrepresentation among learners who are traditionally at risk of marginalization in these areas [15, 16].

In many ways, data science is currently positioned at a critical juncture for research and practice due to new curricular design implementations that circumvent some of these persistent issues in computing. One potential solution lies in the application of best practices observed in other areas of computing education: theories in culturally relevant computing (CRC). CRC theories have precedent as foundational starting points for the intentional design of learning experiences that support learners’ diverse cultural histories, personal interests, and social and political concerns [17, 18]. To address the issues of access and engagement in CS teaching and learning, the use of relevancy as a guiding design concept has been suggested by Ladson-Billings [17] as a crucial component of effective education, and one that is closely tied to issues of equity and social justice. By valuing and centering students’ personal, social, and cultural knowledge and experiences in the learning process, educators can create more meaningful and impactful educational experiences that promote both academic achievement and positive social outcomes. For underrepresented students, educators can increase engagement and learning outcomes by creating curricula that fundamentally link to learners’ individual interests, cultural backgrounds, and sociopolitical environment.

In our work, we apply the idea of relevancy through what we call sandbox data science (SDS). SDS is enabled through freely accessible Application Programming Interfaces (APIs) that can be used to gather or “scrape” data from websites or online sources using automated tools or scripts. For us, SDS has emerged as a promising solution to equip learners with tools to conduct their own explorations, addressing the issue of limited personal engagement in pre-college data science curricula. In this way, social media platforms can serve as a massive, diverse, and flexible library of data that learners can use to explore a wide range of inquiries and construct new knowledge. Similar to the constructionist perspectives and open-ended activities in sandbox science using Scratch [19, 20] and electronic textiles, or E-textiles [21], an emphasis on relevancy in curricular design allows for the varied pursuits and problem-solving challenges that can spur computational thinking with diverse learners [9].
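To make the idea of gathering data through an API with an automated script more concrete, here is a minimal Python sketch of collecting records page by page. It does not reproduce the API used in CLDM or any real platform's interface: the `fake_fetch` function, the canned responses, and the `data` and `next_cursor` field names are hypothetical stand-ins, chosen so the example runs without network access or credentials.

```python
import json

# Hypothetical canned responses standing in for a social media API.
# A real API would return similar JSON over HTTP, with its own field names.
_FAKE_PAGES = {
    None: '{"data": [{"id": "a1", "text": "first"}, '
          '{"id": "a2", "text": "second"}], "next_cursor": "p2"}',
    "p2": '{"data": [{"id": "a3", "text": "third"}], "next_cursor": null}',
}


def fake_fetch(query, cursor=None):
    """Stand-in for an HTTP request; returns the JSON text of one page.

    The query is ignored here because the responses are canned.
    """
    return _FAKE_PAGES[cursor]


def gather(query, fetch=fake_fetch, max_pages=10):
    """Collect records for a learner-chosen query, following page cursors."""
    records = []
    cursor = None
    for _ in range(max_pages):  # cap the number of requests
        page = json.loads(fetch(query, cursor))
        records.extend(page["data"])
        cursor = page.get("next_cursor")
        if cursor is None:  # no further pages to request
            break
    return records


if __name__ == "__main__":
    posts = gather("robotics")
    print(len(posts), "records gathered")
```

Passing the fetch function in as a parameter is a design choice for this sketch: the same loop could be pointed at a real request function later, while the query itself stays in the learner's hands.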

In the paper, we describe how relevancy as a design principle informed the development of key activities in our curriculum (activity design). We then provide an illustrative case example of how this approach might be enacted in the classroom and conclude with reflections on what this might mean for computing-based and culturally relevant data science teaching and learning.

(See the paper for references.)

Engagement Practices Employed

Materials and Links

Computer Science Details

Programming Language
Python

Material Format and Licensing Information

Creative Commons License
CC BY