python - Is it possible to specify the order of levels in Pandas factorize method?


I am using pandas factorize on an array consisting of two types of strings. I want to make sure that one of the strings, "xyz", is coded as 0 and the other string, "abc", is coded as 1.

Is it possible to do this? I looked at the documentation and didn't find anything useful.

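This is the purpose of a Categorical, namely to (optionally) specify the actual categories when factorizing (as well as to specify an ordering if needed). The ordering of the categories determines the factorization ordering. If the categories are unspecified, factorize codes the values in order of appearance, while a Categorical uses the sorted unique values where they can be sorted.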

This requires 0.16.0 for the ability to specify categories directly in .astype; Categoricals were introduced in 0.15.0.

    In [10]: s = Series(list('aaabbaa')).astype('category', categories=list('ab'))

    In [11]: s.cat.codes
    Out[11]:
    0    0
    1    0
    2    0
    3    1
    4    1
    5    0
    6    0
    dtype: int8

Since 'b', 'a' are now the categories, the codes are the opposite of the above.

    In [12]: s = Series(list('aaabbaa')).astype('category', categories=list('ba'))

    In [13]: s.cat.codes
    Out[13]:
    0    1
    1    1
    2    1
    3    0
    4    0
    5    1
    6    1
    dtype: int8
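Later pandas releases dropped the categories= keyword of .astype in favour of passing a CategoricalDtype, so the sessions above will not run unchanged there. The following is a minimal sketch of the same idea applied to the "xyz"/"abc" case from the question, assuming a recent pandas; the sample values and variable names are made up for illustration.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data using the two labels from the question.
values = pd.Series(["abc", "xyz", "xyz", "abc", "xyz"])

# Plain factorize codes labels in order of first appearance,
# so "abc" gets 0 here only because it happens to come first.
default_codes, default_uniques = pd.factorize(values)

# Listing the categories explicitly fixes the mapping regardless of
# where each label first appears: "xyz" -> 0, "abc" -> 1.
dtype = CategoricalDtype(categories=["xyz", "abc"])
codes = values.astype(dtype).cat.codes

print(default_codes.tolist())  # [0, 1, 1, 0, 1]
print(codes.tolist())          # [1, 0, 0, 1, 0]
```

Any value not listed in the categories becomes missing and receives the code -1, so the category list should cover every label that can occur.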
