For Loops, Logic Statements, and Functions#
For loops to perform the same action on the items in a sequence#
Working with sequences#
To execute a (set of) statement(s) once for each item in a sequence (e.g. a string, list, or tuple), we use a for loop. The syntax for a for loop is:
for variable in sequence_name:
    statement(s)
The variable accesses each item of the sequence on each iteration. After the sequence name, we find a colon (:). The statement(s) is (are) indented using whitespace. The loop continues until we reach the last item in the sequence.
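As a minimal sketch of this syntax (the list `bases` and the variable `visited` are illustrative names, not part of the original text), the loop below visits each item in turn:

```python
# Hypothetical example data: a list of DNA bases
bases = ["A", "C", "G", "T"]

visited = ""  # keeps track of the items the loop has seen
for base in bases:            # `base` takes the value of each item in turn
    print(base)               # indented: runs once per item
    visited = visited + base  # indented: also runs once per item
print("Done:", visited)       # not indented: runs once, after the loop has finished
```

Only the indented lines belong to the loop body; the first line that returns to the original indentation level ends it.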
Use list_name.append(new_item) to save the outputs of the statement(s) in a new list: this method appends a new item to the end of an existing list. Strings and tuples are immutable and have no append() method. Make sure to create the new list before the loop, not inside it!
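A short sketch of this pattern, assuming a hypothetical list of volumes (the names `volumes_mL` and `volumes_uL` are illustrative only):

```python
volumes_mL = [10, 25, 50]  # hypothetical volumes in mL
volumes_uL = []            # create the new, empty list BEFORE the loop

for volume in volumes_mL:
    volumes_uL.append(volume * 1000)  # convert mL to µL and append the result

print(volumes_uL)
```

If the empty list were created inside the loop, it would be reset on every iteration and only the last result would survive.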
Working with dictionaries#
To iterate over a dictionary, use:
for key, value in dictionary_name.items():
    statement(s)
Of note, the items() method returns a view object that yields the key-value pairs of the dictionary as tuples. It is not a list, but it can be converted to one with list().
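To illustrate, here is a small sketch with a hypothetical dictionary (`aa_codes` and its contents are made up for this example):

```python
# Hypothetical dictionary: three-letter amino acid codes (keys) and one-letter codes (values)
aa_codes = {"Ala": "A", "Gly": "G", "Ser": "S"}

pairs = list(aa_codes.items())  # wrap the view object in list() to inspect its contents
print(pairs)                    # a list of (key, value) tuples

for three_letter, one_letter in aa_codes.items():  # unpack each tuple into two variables
    print(three_letter, "->", one_letter)
```

Writing two variable names after `for` unpacks each key-value tuple directly, which avoids indexing into the tuple inside the loop.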
Functions#
A function is a block of code that runs only when it is called. A function can take parameters, which are listed in parentheses after the function name; the parentheses are required even when there are no parameters. The syntax for a function is:
def function_name(parameter(s)):
    """
    documentation
    """
    block_of_code
    return value_to_return
We first define the function name and parameters using def.
The optional documentation section (the docstring), between """ and """, contains information about what the function does, including the parameters and what is returned.
The block of code contains the statements that run when the function is called.
Use the return statement to let a function return its result. It is possible to return more than one value from a function. These are returned as a tuple, which can be unpacked into separate variables.
After creating a function in Python, we can call it by writing its name followed by parentheses containing the arguments, i.e. the values passed to the function's parameters.
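The sketch below shows defining a function, calling it, and unpacking two returned values. The function `dilution` and its parameters are hypothetical and chosen only for illustration; it assumes the usual dilution relation C1 * V1 = C2 * V2.

```python
def dilution(c_stock, v_stock, v_final):
    """
    Hypothetical example: dilute a stock solution (C1 * V1 = C2 * V2).
    Args:
        c_stock (float) in mol/L
        v_stock (float) in mL
        v_final (float) in mL
    Returns:
        final concentration (float) in mol/L and dilution factor (float)
    """
    c_final = c_stock * v_stock / v_final
    factor = v_final / v_stock
    return c_final, factor  # two values, returned together as one tuple

result = dilution(2.0, 5.0, 50.0)           # without unpacking, the result is a tuple
c_final, factor = dilution(2.0, 5.0, 50.0)  # unpack the tuple into two variables
print(result)
print(c_final, factor)
```

Unpacking requires as many variable names on the left as there are returned values; otherwise Python raises a ValueError.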
Logic statements to make choices#
Use if, elif, and/or else statements to evaluate a condition and run different code depending on whether it is true, for example when a variable has a particular value.
Comparison operators include:
equal to: ==
not equal to: !=
greater than: >
less than: <
greater than or equal to: >=
less than or equal to: <=
Use and, or, and not to check more than one condition.
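As a hedged illustration of combining conditions, the sketch below uses a hypothetical function `describe_sample`; the thresholds (pH 7, 37 °C) are arbitrary choices for this example.

```python
def describe_sample(pH, temperature):  # temperature in °C
    """Hypothetical helper that combines comparisons with and, or, and not."""
    if pH < 7 and temperature > 37:     # both conditions must be true
        return "acidic and warm"
    elif pH < 7 or temperature > 37:    # at least one condition is true
        return "acidic or warm, but not both"
    elif not pH == 7:                   # reached only if pH >= 7 and temperature <= 37
        return "basic and not warm"
    else:                               # none of the conditions above was true
        return "neutral and not warm"

print(describe_sample(5.0, 40))
print(describe_sample(5.0, 20))
print(describe_sample(9.0, 20))
print(describe_sample(7.0, 20))
```

The branches are tested from top to bottom and only the first one whose condition is true runs, so the order of the if and elif lines matters.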
Examples#
Please pay attention to the use of comments (with #) to express the units of variables or to describe the meaning of commands.
Example
Calculate the \(pK_{b}\) values of aspartate using its \(pK_{a}\) values (2.10, 3.86, 9.82) and save them in a new list.
pKa_asp = [2.10, 3.86, 9.82] #create a list with floats
pKb_asp = [] #create a new, empty list
for pKa in pKa_asp: #select each item from the existing list
    pKb = 14 - pKa #calculate for each item from the existing list the pKb
    pKb_asp.append(pKb) #append the pKb calculated to the newly created list
print(pKb_asp) #print the list that we calculated
[11.9, 10.14, 4.18]
Example
Create and test a function that calculates the concentration of a solution using the Beer-Lambert law. Parameters include the molar extinction coefficient, absorbance, and cuvette pathlength.
def beer_lambert(epsilon, absorbance, pathlength): #create the function
    """
    Calculate the concentration of a solution, using the Beer-Lambert law.
    Args:
        epsilon (float) in L/(mol cm)
        absorbance (float) in AU
        pathlength (float) in cm
    Returns:
        concentration of the solution (float) in (mol/L)
    """
    concentration = absorbance / (epsilon * pathlength)
    return concentration
beer_lambert(21000, 0.89, 1) #test the function with random but reasonable parameters
4.238095238095238e-05
Example
Create and test a function that reports whether the net charge of a protein is positive, negative, or zero. Parameters include the pH of the solution and the isoelectric point of the protein.
def prot_charge(pH, pI): #create the function
    if pH > pI: #if greater than
        print("pH is above pI, the protein is negatively charged") #print the outcome
    elif pH == pI: #if equal to
        print("pH and pI are equal, the protein has no net charge") #print the outcome
    else: #if smaller than
        print("pH is below pI, the protein is positively charged") #print the outcome
prot_charge(5.4, 7.5) #test the function with random but reasonable parameters
pH is below pI, the protein is positively charged
Exercises#
Exercise
Count the number of restriction enzyme sites for EcoRI, BamHI, EarI, ScaI, NotI, TaqI, FokI, and HindIII in the following DNA sequence.
Tip: Use the dictionary from earlier!
DNAseq = "GTAAAACGACGGCCAGTGAATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTG" #create a string using double quotation marks
Solution
Here’s one possible solution.
REs = {
'EcoRI' : 'GAATTC',
'BamHI' : 'GGATCC',
'EarI' : 'CTCTTC',
'ScaI' : 'AGTACT',
'NotI' : 'GCGGCCGC',
'TaqI' : 'TCGA',
'FokI' : 'GGATG',
'HindIII' : 'AAGCTT'
} #Create a dictionary with restriction enzyme names (keys) and cleavage sites (values). Both are strings.
for name, site in REs.items(): #select for each item from the existing dictionary key and value
    count = DNAseq.count(site) #count the number of times the site appears in the string
    print("For", name, ", there are", count, "sites .") #Print and place all values that we calculated in a readable sentence. The print function can take more than one object. By default, it separates objects with a space.
Exercise
Create and test a function that accepts a DNA sequence and outputs the complementary and reverse complementary strand.
Tip: To complement a DNA sequence, one needs to replace A with T, T with A, C with G and G with C. The reverse complement is the reverse of the complement sequence (i.e. the first base becomes the last). For example, for the strand 5’-AAGCCGA-3’, the complementary strand is 3’-TTCGGCT-5’ and the reverse complementary strand is 5’-TCGGCTT-3’.
Solution
Here are some possible solutions.
OPTION 1: Using logic statements
def comp_revcomp(DNAstring): #create the function
    Comp = "" #create a new, empty string
    for base in DNAstring: #select each item from the DNAstring
        if base == "T": #if the base is T
            Comp = Comp + "A" #replace it by A and add to the new string
        elif base == "A": #if the base is A
            Comp = Comp + "T" #replace it by T and add to the new string
        elif base == "C": #if the base is C
            Comp = Comp + "G" #replace it by G and add to the new string
        elif base == "G": #if the base is G
            Comp = Comp + "C" #replace it by C and add to the new string
        else: #if the base is not T, A, C, or G
            Comp = Comp + "N" #replace it by N and add to the new string
    revComp = Comp[::-1] #reverse the new string: start at the end of the string and end at position 0, move with step -1, which is one step backwards
    print("For 5'-", DNAstring, "-3', the complement is 3'-", Comp, "-5' .", "The reverse complement is 5'-", revComp, "-3' .") #Print and place all values that we calculated in a readable sentence. The print function can take more than one object. By default, it separates objects with a space.
comp_revcomp("AAGCCGA") #test the function with random but reasonable parameters
OPTION 2: Using a dictionary
def comp_revcomp2(DNAstring): #create the function
    BaseComp = {
        'A' : 'T',
        'T' : 'A',
        'C' : 'G',
        'G' : 'C'
    } #Create a dictionary with complementary base pairs.
    comp = "".join(BaseComp.get(x, "N") for x in DNAstring) #Loop through the bases of the DNAstring and look up the complementary base in the dictionary; a base that is not in the dictionary becomes N. The join method concatenates the resulting bases into one string, see https://www.programiz.com/python-programming/methods/string/join.
    revcomp = comp[::-1] #reverse the new string: start at the end of the string and end at position 0, move with step -1, which is one step backwards
    print("For 5'-", DNAstring, "-3', the complement is 3'-", comp, "-5' .", "The reverse complement is 5'-", revcomp, "-3' .") #Print and place all values that we calculated in a readable sentence. The print function can take more than one object. By default, it separates objects with a space.
comp_revcomp2("AAGCCGA") #test the function with random but reasonable parameters
OPTION 3: With maketrans() and translate() functions, see https://www.geeksforgeeks.org/python-maketrans-translate-functions/.
def comp_revcomp3(DNAstring): #create the function
    mydictcomp = str.maketrans("ATCG","TAGC") #use maketrans() to construct the translate table: A (from string 1) becomes T (from string 2), T becomes A ...
    comp = DNAstring.translate(mydictcomp) #use translate() to perform the translations
    revcomp = comp[::-1] #reverse the new string: start at the end of the string and end at position 0, move with step -1, which is one step backwards
    print("For 5'-", DNAstring, "-3', the complement is 3'-", comp, "-5' .", "The reverse complement is 5'-", revcomp, "-3' .") #Print and place all values that we calculated in a readable sentence. The print function can take more than one object. By default, it separates objects with a space.
comp_revcomp3("AAGCCGA") #test the function with random but reasonable parameters
OPTION 4: Use Biopython, a set of freely available tools for biological computation written in Python by an international team of developers. Available via https://biopython.org/.
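The printing solutions show their result on screen but do not hand it back to the caller. As a hedged variant (the name `comp_revcomp_values` is hypothetical and not one of the original solutions), a function can instead return both strands as a tuple, which makes it easy to check against the example from the tip:

```python
def comp_revcomp_values(DNAstring):
    """Return the complement and the reverse complement of a DNA string as a tuple."""
    table = str.maketrans("ATCG", "TAGC")  # translation table: A->T, T->A, C->G, G->C
    comp = DNAstring.translate(table)      # complement, written 3' to 5'
    revcomp = comp[::-1]                   # reverse complement, written 5' to 3'
    return comp, revcomp

comp, revcomp = comp_revcomp_values("AAGCCGA")  # unpack the two returned strings
print(comp, revcomp)
```

Note that translate() leaves characters that are not in the table unchanged, so this sketch does not replace unknown bases with N.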
Exercise
Create and test a function that gives the molecular weight of a peptide from the one-letter amino acid code and the amino acid monoisotopic masses.
MonoIsoMassAA = {'A': 71.04,
'C': 103.01,
'D': 115.03,
'E': 129.04,
'F': 147.07,
'G': 57.02,
'H': 137.06,
'I': 113.08,
'K': 128.09,
'L': 113.08,
'M': 131.04,
'N': 114.04,
'P': 97.05,
'Q': 128.06,
'R': 156.10,
'S': 87.03,
'T': 101.05,
'V': 99.07,
'W': 186.08,
'Y': 163.06
} #Create a dictionary with one-letter amino acid codes (keys) and monoisotopic residue masses in Da (values).
Solution
Here are some possible solutions.
OPTION 1:
def protmass(ProteinString): #create the function
    weight = 0 #start with 0 g/mol
    for code, mass in MonoIsoMassAA.items(): #select for each item from the existing dictionary key and value
        count = ProteinString.count(code) #count the number of times the amino acid appears in the ProteinString
        weight = weight + count * mass #Multiply the number of times this amino acid appears with its monoisotopic mass. Add this number to the current weight.
    return weight + 18.01 #return the weight + 18.01, the mass of water
protmass("AVATAR") #test the function with random but reasonable parameters
OPTION 2:
def protmass2(ProteinString): #create the function
    weight = sum(MonoIsoMassAA.get(x) for x in ProteinString) #Loop through the amino acids of the ProteinString and get their corresponding monoisotopic mass. These are summed.
    return weight + 18.01 #return the weight + 18.01, the mass of water
protmass2("AVATAR") #test the function with random but reasonable parameters
OPTION 3:
def protmass3(ProteinString): #create the function
    weight = sum(map(MonoIsoMassAA.get, ProteinString)) #The map() function executes a specified function - in this case MonoIsoMassAA.get - for each amino acid of the ProteinString. It returns the monoisotopic masses from the dictionary, which are summed.
    return weight + 18.01 #return the weight + 18.01, the mass of water
protmass3("AVATAR") #test the function with random but reasonable parameters
OPTION 4: Use Biopython, a set of freely available tools for biological computation written in Python by an international team of developers. Available via https://biopython.org/.
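The counting and summing approaches add the same masses in a different order, and floating-point sums can differ in their last digits depending on that order. A cautious way to compare such results is with a tolerance. The sketch below assumes a reduced copy of the mass dictionary (only the residues in "AVATAR"); the names `masses`, `WATER`, `mass_by_count`, and `mass_by_sum` are hypothetical.

```python
import math

# Reduced copy of the mass dictionary: only the residues needed for "AVATAR"
masses = {'A': 71.04, 'R': 156.10, 'T': 101.05, 'V': 99.07}
WATER = 18.01  # mass of water in Da, added once per peptide

def mass_by_count(peptide):
    """Count each amino acid and add count * mass (dictionary order)."""
    weight = 0
    for code, mass in masses.items():
        weight = weight + peptide.count(code) * mass
    return weight + WATER

def mass_by_sum(peptide):
    """Look up the mass of each residue and sum them (sequence order)."""
    return sum(masses[x] for x in peptide) + WATER

m1 = mass_by_count("AVATAR")
m2 = mass_by_sum("AVATAR")
print(round(m1, 2), round(m2, 2))  # rounded to the precision of the input masses
print(math.isclose(m1, m2))        # compare with a tolerance instead of ==
```

Indexing with `masses[x]` raises a KeyError for an unknown one-letter code, which flags a typo in the sequence right at the lookup.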