python - Determining relative letter frequency
I need to create a function that takes a text file as input and returns a vector of size 26 with the frequency in percent of each character (a to z). It must be case-insensitive. Other letters (e.g. å) and symbols should be ignored.
I've tried to use one of the answers here, the answer by 'jacob': Determining Letter Frequency Of Cipher Text
This is my code so far:
    def letterfrequency(filename):
        #f: the text file converted to lowercase
        f=filename.lower()
        #n: the sum of the letters in the text file
        n=float(len(f))
        import collections
        dic=collections.defaultdict(int)
        #the absolute frequencies
        for x in f:
            dic[x]+=1
        #the relative frequencies
        from string import ascii_lowercase
        for x in ascii_lowercase:
            return x,(dic[x]/n)*100
For example, if I try this:
    print(letterfrequency('i have no idea'))
    >>> ('a',14.285714)
Why does it not print the relative values of all the letters? And what about letters that are not in the string, like z in my example?
And how do I make the code print a vector of size 26?
Edit: I have tried using Counter, but it prints ('a':14.2857) and the letters in mixed order. I need the relative frequency of the letters in an ordered sequence!
    for x in ascii_lowercase:
        return x,(dic[x]/n)*100
The function returns in the first iteration of the loop, so only the tuple for 'a' ever comes back.
Instead, change it to return a list of tuples:
    letters = []
    for x in ascii_lowercase:
        letters.append((x,(dic[x]/n)*100))
    return letters
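Note that the fix above still divides by the length of the whole string, so spaces and symbols count toward the total, and the argument is treated as the text itself rather than as a file name. If the goal is a plain vector of 26 percentages that ignores everything except a to z, something along these lines might work. It is only a sketch, assuming Python 3; the names letter_frequency and letter_frequency_from_file are made up for illustration, and UTF-8 is an assumed file encoding.

```python
from collections import Counter
from string import ascii_lowercase


def letter_frequency(text):
    """Return 26 percentages, one per letter a-z, in alphabetical order."""
    # Count only the ASCII letters a-z; 'å', digits, spaces and symbols are skipped.
    counts = Counter(ch for ch in text.lower() if ch in ascii_lowercase)
    total = sum(counts.values())
    if total == 0:
        # No a-z letters at all: avoid dividing by zero.
        return [0.0] * 26
    # Counter returns 0 for missing keys, so letters that never occur get 0.0.
    return [100.0 * counts[ch] / total for ch in ascii_lowercase]


def letter_frequency_from_file(path, encoding="utf-8"):
    """Hypothetical helper: read a text file and return its letter frequencies."""
    with open(path, encoding=encoding) as handle:
        return letter_frequency(handle.read())
```

With this version, letter_frequency('i have no idea') gives about 18.18 for 'a' instead of 14.29, because the three spaces are no longer part of the total (2 out of 11 letters rather than 2 out of 14 characters), and the entry for 'z' is 0.0.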