HIM3002 Computer Programming
Tung Wah College
HIM3002: Computer Programming for Healthcare
Individual assignment: Finding Patterns in Sequence
Background
In Lecture 6, we learned programming techniques for finding patterns in biological sequences, specifically fixed patterns and flexible patterns. In this tutorial exercise, you will get further practice in finding patterns in sequences. Name your file “T06_PatternAnalyser.py”.
Task 1
Write a function that takes a sequence and an integer k as inputs. It returns True if the input sequence has a repeated sub-sequence of size k, and False otherwise. Your function should be similar to that below.
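One possible approach to Task 1 (not the template referred to above) is to slide a window of size k along the sequence and remember every sub-sequence already seen. The sketch below is illustrative only: the function name `has_repeated_subsequence` is hypothetical, and it assumes that overlapping occurrences count as repeats and that a non-positive k simply returns False.

```python
def has_repeated_subsequence(sequence, k):
    """Return True if any sub-sequence of length k occurs more than once.

    Assumptions (not stated in the assignment): overlapping occurrences
    count as repeats, and k <= 0 returns False.
    """
    if k <= 0:
        return False
    seen = set()
    # Slide a window of width k across the sequence.
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if window in seen:
            return True
        seen.add(window)
    return False


if __name__ == "__main__":
    print(has_repeated_subsequence("ACGTACGA", 3))  # "ACG" occurs twice
    print(has_repeated_subsequence("ACGT", 2))      # all 2-mers are distinct
```

Using a set keeps each membership check cheap, so the whole scan needs only one pass over the sequence.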
By making use of the function “re_all_match”, find the following patterns in the sequence.
- a) A DNA pattern with four symbols, with “A” as the first symbol and “T” as the last symbol, e.g., “AGGT”, “ACTT”.
- b) A DNA pattern with at least two symbols, with “A” as the first symbol and “T” as the last symbol, e.g., “AT”, “ACT”, “AGGGT”.
- c) A DNA pattern with at least three symbols, with “A” at the beginning and “T” at the end, and any symbols except “C” in between, e.g., “AGAT”, “AAGT”, “AATGT” but not “ACGT”, assuming only “A”, “G”, “C” and “T” are in the sequence.
- d) A protein pattern with between 10 and 15 symbols, with “M” at the beginning and “D” at the end.
For example,
- a) Sequence: AGGTAGTTTGACGTTACTG
  Found pattern: AGGT located at 0
  Found pattern: AGTT located at 4
  Found pattern: ACGT located at 10
- b) Sequence: AGGTGCAAGTGACGAACAAG
  Found pattern: AGGTGCAAGT located at 0
  Found pattern: AAGT located at 6
  Found pattern: AGT located at 7
- c) Sequence: AGGTGCAAGTGACGAACAAG
  Found pattern: AGGT located at 0
  Found pattern: AAGT located at 6
  Found pattern: AGT located at 7
- d) Sequence: CDEMECMEDDFEMECMEDDFEMECMEDDFEMECMEDDFEGHIEJMCEE
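The lecture's “re_all_match” function is not reproduced in this handout, so the following is only a minimal sketch of how such a function might behave. It assumes the function reports overlapping matches by restarting the search one position after each match start, which is consistent with the example output for b). The regular expressions are one possible reading of patterns a) to c), not an official answer.

```python
import re


def re_all_match(pattern, sequence):
    """Return (start, text) for every match, including overlapping ones.

    Assumed behaviour: after each match, searching resumes one position
    past the start of that match rather than past its end.
    """
    regex = re.compile(pattern)
    found = []
    pos = 0
    while pos <= len(sequence):
        match = regex.search(sequence, pos)
        if match is None:
            break
        found.append((match.start(), match.group()))
        pos = match.start() + 1
    return found


# Candidate regular expressions for patterns a) to c).
PATTERN_A = r"A[ACGT]{2}T"   # exactly four symbols, A....T
PATTERN_B = r"A[ACGT]*T"     # at least two symbols, A....T
PATTERN_C = r"A[AGT]+T"      # at least three symbols, no C in between

if __name__ == "__main__":
    for label, pattern, seq in [
        ("a", PATTERN_A, "AGGTAGTTTGACGTTACTG"),
        ("b", PATTERN_B, "AGGTGCAAGTGACGAACAAG"),
        ("c", PATTERN_C, "AGGTGCAAGTGACGAACAAG"),
    ]:
        print(f"{label}) Sequence: {seq}")
        for start, text in re_all_match(pattern, seq):
            print(f"Found pattern: {text} located at {start}")
```

Pattern d) could be handled the same way with a bounded quantifier such as `M[A-Z]{8,13}D`, assuming the intended length is between 10 and 15 symbols.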