This is the first of five programming assignments in a semester-long, CS-1-like course named DNA, which introduces students to programming within the context of genomics. The assignment asks for a Python program that performs an introductory analysis of a "snip" of DNA containing some upstream (intergenic) sequence and the beginning (but not all) of a gene (genic sequence). A "Starter Kit" includes a template Python source file that shows, by example, good introductory and inline documentation, the use of good (camelCase) variable names, and a healthy dose of print statements that produce meaningful and neat output.
Explore alternate solutions for obtaining a reverse complement sequence, in particular the str.maketrans() and translate() utilities. Regularly allocate time in your class sessions to bring in colleagues, in particular a biologist who can talk (briefly) about the beauty of DNA. Use a "flipped classroom," where students watch lectures before or after class, to maximize the amount of hands-on Python play. For example, (a) require students to watch lectures on biological topics (cf. Udacity's 'Tales of the Genome' MOOC) outside of class; and (b) leverage programming practice sites for student practice outside of class (cf. Codecademy's Python course).
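As one way to frame the comparison of reverse complement solutions, the sketch below sets a loop-and-dictionary version next to a str.maketrans()/translate() version. The function names and the sample sequence are hypothetical and not taken from the Starter Kit; the loop version assumes uppercase A, C, G, T input only.

```python
# Hypothetical sketch: two ways to build a reverse complement.

def reverseComplementLoop(dna):
    """Walk the strand backwards, pairing each base by dictionary lookup."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    complement = []
    for base in reversed(dna):
        complement.append(pairs[base])   # raises KeyError on any other character
    return "".join(complement)


# Translation table built once; characters not listed pass through unchanged.
COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")

def reverseComplementTranslate(dna):
    """Complement every base with translate(), then reverse with a slice."""
    return dna.translate(COMPLEMENT_TABLE)[::-1]


snip = "ATGCCC"   # made-up example sequence
print("Loop version:     ", reverseComplementLoop(snip))
print("Translate version:", reverseComplementTranslate(snip))
```

Both versions should agree on uppercase input. The translate() version also preserves case and leaves unexpected characters (such as N) in place, which may or may not be the behavior a given class wants to discuss.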
We live in a post-genomic world where strings of sequenced DNA are the starting point for discovery, from basic research to personalized medicine. Beyond its interdisciplinary connections with the life sciences, the assignment covers programming concepts that include string indexing, finding substrings, slicing pieces of strings and storing partial solutions, flipping between cases of strings, joining and reversing strings, counting characters within a string, and translating nucleotides between strands of DNA.
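The string concepts listed above can be illustrated in a few lines of Python. This is only a sketch: the sequence is made up, the variable names are hypothetical, and the convention of lowercase for upstream bases and uppercase for genic bases is an assumption made here for illustration rather than something the assignment specifies.

```python
# Hypothetical snip: lowercase = upstream (intergenic), uppercase = genic.
dnaSnip = "ccgtataatgcgATGGCCTTAGGC"

# Indexing and finding substrings (find() is case-sensitive).
firstBase = dnaSnip[0]
startIndex = dnaSnip.find("ATG")

# Slicing and storing partial solutions.
upstream = dnaSnip[:startIndex]
genic = dnaSnip[startIndex:]

# Flipping between cases.
upstreamUpper = upstream.upper()

# Counting characters.
gcCount = genic.count("G") + genic.count("C")

# Reversing and joining.
genicReversed = genic[::-1]
codons = [genic[i:i + 3] for i in range(0, len(genic), 3)]
codonString = "-".join(codons)

print("First base:        ", firstBase)
print("Start codon index: ", startIndex)
print("Upstream sequence: ", upstream)
print("Upstream (upper):  ", upstreamUpper)
print("Genic sequence:    ", genic)
print("G+C count (genic): ", gcCount)
print("Genic reversed:    ", genicReversed)
print("Codons:            ", codonString)
```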