Python is truncating my file contents
I have been set a task in Python to code a long text file as 1-26 for the letters of the alphabet and 26+ for non-alphanumerics. See the code below:
#open file, read contents, print out
my_file = open("timemachine.txt")
my_text = my_file.read()
print (my_text)
print ""
print ""

#open file, read each line, taking out eol chars
with open("timemachine.txt","r") as myfile:
    clean_text = "".join(line.rstrip() for line in myfile)

#close file to prevent memory hogging
my_file.close()

#print out result in lower case
clean_text_lower = clean_text.lower()
print clean_text_lower
print ""
print ""

#establish lowercase alphabet as a list
my_alphabet_list = []
my_alphabet = """ abcdefghijklmnopqrstuvwxyz.,;:-_?!'"()[] %/1234567890"""+"\n"+"\xef"+"\xbb"+"\xbf"

#find index for each lowercase letter or non-alphanumeric
for letter in my_alphabet:
    my_alphabet_list.append(letter)
print my_alphabet_list,
print my_alphabet_list.index
print ""
print ""

#go through text and find corresponding letter of alphabet
for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

When I print this I should get (1) the original text, (2) the text reduced to lower case with no whitespace, (3) the code index used, and (4) the converted codes. However, I can only get the latter part of the original text, or, if I comment out (4), it prints all of the text. Why?
The bit at the end:

for letter in clean_text_lower:
    posn = my_alphabet_list.index(letter)
print posn,

keeps reassigning posn without doing anything with it. Therefore, you will only get my_alphabet_list.index(letter) for the last letter in clean_text_lower.
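To see the effect in isolation, here is a minimal sketch of the same loop. It is written for Python 3 (the question uses Python 2 print statements), and `alphabet` and `sample_text` are made-up stand-ins rather than the contents of timemachine.txt.

```python
# Illustrative stand-ins; not the real alphabet or file contents.
alphabet = list(" abcdefghijklmnopqrstuvwxyz")
sample_text = "time"

for letter in sample_text:
    posn = alphabet.index(letter)  # overwritten on every pass

# The print sits outside the loop, so only the last letter's index survives.
print(posn)  # index of "e" in alphabet
```

Four letters go in, but only one number comes out, because each pass through the loop replaces the previous value of `posn`.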
To fix this there are a couple of things you can do. The first thing that springs to mind is to initialize a list and append the values to it, i.e.:

posns = []
for letter in clean_text_lower:
    posns.append(my_alphabet_list.index(letter))
print posns,
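Another option, sketched below for Python 3, is to build a lookup table once and collect the codes with a list comprehension. The `encode` helper and the sample string are hypothetical names for illustration, and the alphabet here leaves out the three byte-order-mark characters from the question. This is one possible approach under those assumptions, not the only way to do it.

```python
# Same character set as in the question, minus the BOM bytes (an assumption).
alphabet = " abcdefghijklmnopqrstuvwxyz.,;:-_?!'\"()[] %/1234567890\n"

def encode(text, alphabet=alphabet):
    """Return the alphabet index of every character in text, in order."""
    # Build the lookup once; list.index would rescan the alphabet per character.
    lookup = {}
    for i, ch in enumerate(alphabet):
        lookup.setdefault(ch, i)  # keep the first occurrence, like list.index
    return [lookup[ch] for ch in text.lower()]

codes = encode("The Time")
print(codes)  # [20, 8, 5, 0, 20, 9, 13, 5]
```

A character that is not in the alphabet raises a KeyError here, much as list.index raises a ValueError in the original loop, so unexpected characters in the file still need handling either way.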