c++ - Tokenize sentence into words, considering special characters


I have a function that receives a sentence and tokenizes it into words, splitting on the space character " ". Now I want to improve the function so that it also eliminates special characters, for example:

I am a boy.   => {I, am, a, boy}, no period after "boy"
I said :"are you ok?"  => {I, said, are, you, ok}, no question mark or quotation marks

The original function is here. How can I improve it?

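#include <string>
#include <vector>
using namespace std;

void tokenize(const string& str, vector<string>& tokens, const string& delimiters = " ")
{
    // Skip any delimiters at the start of the string.
    string::size_type lastpos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the current token.
    string::size_type pos = str.find_first_of(delimiters, lastpos);

    while (string::npos != pos || string::npos != lastpos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastpos, pos - lastpos));
        // Skip the delimiters that follow the token.
        lastpos = str.find_first_not_of(delimiters, pos);
        // Find the end of the next token.
        pos = str.find_first_of(delimiters, lastpos);
    }
}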
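One low-effort option is to keep the function as it is and pass a longer delimiter set, for example tokenize(str, tokens, " .,:;?!\""), since find_first_of treats every character in the string as a delimiter. If listing every punctuation character is impractical, another possible approach is to treat anything that is not a letter or digit as a separator. The sketch below assumes that definition of a "word"; the name tokenize_words is hypothetical and not part of the original code.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): splits a sentence into
// words, treating every character that is not a letter or digit as a separator.
std::vector<std::string> tokenize_words(const std::string& str)
{
    std::vector<std::string> tokens;
    std::string current;

    // Iterate as unsigned char: std::isalnum has undefined behavior
    // for negative values other than EOF.
    for (unsigned char ch : str)
    {
        if (std::isalnum(ch))
        {
            // Still inside a word, keep collecting characters.
            current.push_back(static_cast<char>(ch));
        }
        else if (!current.empty())
        {
            // Hit a separator, so the current word is complete.
            tokens.push_back(current);
            current.clear();
        }
    }

    // Flush the last word if the sentence did not end with a separator.
    if (!current.empty())
    {
        tokens.push_back(current);
    }
    return tokens;
}
```

Under this assumption, tokenize_words("I am a boy.") yields the four words without the trailing period. One trade-off to be aware of: apostrophes and hyphens also count as separators here, so "don't" would come out as "don" and "t". If that matters, the explicit delimiter list with the original function gives finer control.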

