Check the similarity between two words with NLTK with Python

January 15, 2010

I have two lists, and I want to check the similarity between each pair of words across the two lists and find the maximum similarity. Here is my code:

    from nltk.corpus import wordnet

    list1 = ['compare', 'require']
    list2 = ['choose', 'copy', 'define', 'duplicate', 'find', 'how', 'identify', 'label', 'list', 'listen', 'locate', 'match', 'memorise', 'name', 'observe', 'omit', 'quote', 'read', 'recall', 'recite', 'recognise', 'record', 'relate', 'remember', 'repeat', 'reproduce', 'retell', 'select', 'show', 'spell', 'state', 'tell', 'trace', 'write']
    list = []

    for word1 in list1:
        for word2 in list2:
            wordfromlist1 = wordnet.synsets(word1)[0]
            wordfromlist2 = wordnet.synsets(word2)[0]
            s = wordfromlist1.wup_similarity(wordfromlist2)
            list.append(s)

    print(max(list))

But the result is an error:

    wordfromlist2 = wordnet.synsets(word2)[0]
    IndexError: list index out of range

Please help me fix this.
Thank you.

You're getting an error if a synset list is empty and you try to get the element at (non-existent) index zero. But why only check the zeroth element? If you want to check everything, try all pairs of elements in the returned synsets. You can use itertools.product() to save yourself two for-loops:

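    from itertools import product

    sims = []

    for word1, word2 in product(list1, list2):
        syns1 = wordnet.synsets(word1)
        syns2 = wordnet.synsets(word2)
        # An empty synset list simply yields no pairs, so nothing is indexed.
        for sense1, sense2 in product(syns1, syns2):
            d = wordnet.wup_similarity(sense1, sense2)
            # Record the two senses that were compared, not the whole lists.
            sims.append((d, sense1, sense2))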

This is inefficient because the same synsets are looked up again and again, but it is the closest to the logic of your code. If you have enough data to make speed an issue, you can speed it up by collecting the synsets for all words in list1 and list2 just once, and then taking the product of the synsets.

    >>> allsyns1 = set(ss for word in list1 for ss in wordnet.synsets(word))
    >>> allsyns2 = set(ss for word in list2 for ss in wordnet.synsets(word))
    >>> best = max((wordnet.wup_similarity(s1, s2) or 0, s1, s2) for s1, s2 in
    ...         product(allsyns1, allsyns2))
    >>> print(best)
    (0.9411764705882353, Synset('command.v.02'), Synset('order.v.01'))
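
If you want to reuse this, the same idea can be wrapped in a small function. The following is only a sketch: `max_similarity`, `toy_lookup`, `toy_similarity` and the `TOY_SENSES` table are names made up for illustration, and the toy data stands in for WordNet so the example runs without NLTK installed. The lookup and the similarity measure are passed in as arguments; a similarity of None is treated as 0, as in the one-liner above.

```python
from itertools import product


def max_similarity(words1, words2, lookup, similarity):
    """Return (score, sense1, sense2) for the most similar pair of senses.

    Returns None when either word list yields no senses at all, instead
    of raising an error on an empty lookup result.
    """
    # Look up every word once; words without senses contribute nothing.
    senses1 = [s for w in words1 for s in lookup(w)]
    senses2 = [s for w in words2 for s in lookup(w)]

    best = None
    for s1, s2 in product(senses1, senses2):
        # Treat a missing similarity (None) as 0.0 so scores stay comparable.
        score = similarity(s1, s2) or 0.0
        if best is None or score > best[0]:
            best = (score, s1, s2)
    return best


# Hypothetical stand-in data, NOT real WordNet output.
TOY_SENSES = {
    "compare": ["compare.v.01"],
    "match": ["match.v.01", "match.n.01"],
    "how": [],  # a word with no senses, like the one that broke the original code
}


def toy_lookup(word):
    """Return the toy senses of a word, or an empty list if it is unknown."""
    return TOY_SENSES.get(word, [])


def toy_similarity(s1, s2):
    """Made-up measure: None across parts of speech, else 1.0 or 0.5."""
    if s1.split(".")[1] != s2.split(".")[1]:
        return None
    return 1.0 if s1 == s2 else 0.5


result = max_similarity(["compare"], ["how", "match"], toy_lookup, toy_similarity)
print(result)

# With NLTK, the call would presumably look like this:
# max_similarity(list1, list2, wordnet.synsets, wordnet.wup_similarity)
```

Passing the lookup and the measure as arguments also makes it easy to try a different measure without touching the loop.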
