#! /usr/bin/env python3
import sys
import re
## Python 8 problem sets
# 1.0 Take a multi-FASTA file (Python_08.fasta) from user input and calculate the nucleotide composition of each sequence. Use a data structure to keep count. Print out each sequence name and its composition in this format
# Go through the FASTA file one line at a time, using the sequence ID from each header line
# as a key in the top-level dictionary. The value stored under each sequence ID is a
# sub-dictionary of nucleotide content. Each nucleotide (A, T, G, C) is a separate key in
# that sub-dictionary, and its value is the number of times that nucleotide occurs
# in the sequence.
# fasta[gene_name] = {'A': #, 'T': #, 'G': #, 'C': #}, so fasta[gene_name][nt] is the count of nt
# construct the empty top-level dictionary
# populate each dictionary with the proper key-value pairs
# count nucleotides and add them into the lowest dictionary level
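# A minimal sketch of the nested layout described above. The name example_counts,
# the ID "seq1" and the sequence "AATGC" are made up for illustration only; they
# are not part of the problem set.
example_counts = {}
example_counts["seq1"] = {}  # one sub-dictionary per sequence ID
for nt in "AATGC":
    # dict.get returns 0 the first time a nucleotide is seen
    example_counts["seq1"][nt] = example_counts["seq1"].get(nt, 0) + 1
# example_counts is now {"seq1": {"A": 2, "T": 1, "G": 1, "C": 1}}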
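fasta = {}  # highest-level dictionary: {sequence_name: {nucleotide: count}}
with open(sys.argv[1]) as fasta_file:  # sys.argv[1] is the path, so open it first
    for line in fasta_file:
        line = line.rstrip()
        header = re.match(r"^>(\S+)", line)  # header lines start with ">"
        if header:
            gene_name = header.group(1)  # sequence ID without the leading ">"
            fasta[gene_name] = {}
        elif line and fasta:  # sequence line that belongs to the last header seen
            for nt in line.upper():
                fasta[gene_name][nt] = fasta[gene_name].get(nt, 0) + 1
print(fasta)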
#for line in sys.argv[1]:
# line = line.split()
# if r"(^>.*\s)" in line:
# line = fasta[line]
#
#print(fasta)
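
# One possible way to print the composition, sketched under assumptions: the
# problem statement above does not show the required output format, so the
# tab-separated layout and the helper name format_composition are hypothetical.
def format_composition(counts, nucleotides="ATGC"):
    """Return one tab-separated line per sequence: name, then one count per nucleotide."""
    lines = []
    for name, composition in counts.items():
        # Nucleotides that never occur in a sequence are reported as 0.
        fields = [name] + [str(composition.get(nt, 0)) for nt in nucleotides]
        lines.append("\t".join(fields))
    return "\n".join(lines)

# Possible use with the top-level dictionary: print(format_composition(fasta))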