Course Level
CS1
Knowledge Unit
Fundamental Programming Concepts
Collection Item Type
Project
Synopsis

In this project, students work individually or in pairs to write a text-identification class that uses a Naive Bayes classifier to determine which author is most likely to have written a given text. Students design a class that compiles features from a text, such as word lengths, sentence lengths, and the distribution of word stems, and then use those features in the Naive Bayes algorithm to classify texts. Besides reviewing string manipulation, the project requires students to design and document a class, so it is particularly useful for students who need additional practice with strings and class design.
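
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the project's reference solution: the names `extract_features` and `AuthorClassifier` are hypothetical, only word lengths and sentence lengths are used as features (word stems are omitted), every author is assumed equally likely beforehand, and add-one smoothing is an assumed choice for handling features an author has never produced.

```python
import math
import re
from collections import Counter


def extract_features(text):
    """Count simple features of a text: word lengths and sentence lengths."""
    features = Counter()
    # Split on sentence-ending punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    for sentence in sentences:
        words = re.findall(r"[A-Za-z']+", sentence)
        if not words:
            continue
        features[("sentence_length", len(words))] += 1
        for word in words:
            features[("word_length", len(word))] += 1
    return features


class AuthorClassifier:
    """Hypothetical Naive Bayes text classifier with a uniform prior."""

    def __init__(self):
        # Maps each author name to a Counter of feature counts.
        self.counts = {}

    def train(self, author, text):
        """Add the features of a text known to be written by author."""
        self.counts.setdefault(author, Counter()).update(extract_features(text))

    def log_score(self, author, text):
        """Return the log-likelihood of the text's features under author."""
        counts = self.counts[author]
        total = sum(counts.values())
        # All distinct features seen in training, across every author.
        vocabulary = set()
        for author_counts in self.counts.values():
            vocabulary.update(author_counts)
        score = 0.0
        for feature, n in extract_features(text).items():
            # Add-one smoothing; the extra 1 in the denominator reserves
            # probability mass for features never seen in training.
            p = (counts[feature] + 1) / (total + len(vocabulary) + 1)
            # Sum logs instead of multiplying probabilities to avoid underflow.
            score += n * math.log(p)
        return score

    def classify(self, text):
        """Return the trained author with the highest log-likelihood."""
        return max(self.counts, key=lambda author: self.log_score(author, text))


classifier = AuthorClassifier()
classifier.train("Author A", "I am up. We go in. It is so.")
classifier.train(
    "Author B",
    "Extraordinary circumstances necessitate considerable deliberation. "
    "Magnificent architecture overwhelms unprepared travellers.",
)
print(classifier.classify("He is at it. Do we go?"))
```

With these two toy training texts, the short test sentence is attributed to "Author A" because its two-letter words match that author's word-length counts. A real submission would train on much longer texts and would likely add further features.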

Recommendations

Integrate pair programming when it is appropriate for students to collaborate on a lab, assignment, or project.

Engagement Highlights

Uses text from a well-known author (J.K. Rowling) to create Meaningful and Relevant Context. Incorporates Student Choice by letting students create a word cloud for extra credit and encouraging them to indicate which features and texts their program should use.

Materials and Links

Materials

Computer Science Details

Programming Language
Python

Material Format and Licensing Information

Creative Commons License
CC BY-NC