python - Is it possible to specify the order of levels in Pandas factorize method?
I am using pandas factorize on an array consisting of two types of strings. I want to make sure that one of the strings, "xyz", is coded as 0 and the other string, "abc", is coded as 1.
Is it possible to do this? I looked at the documentation and didn't find anything useful.
This is the purpose of Categorical, namely to (optionally) specify the actual categories when factorizing (as well as to specify an ordering if needed). The ordering of the categories determines the factorization ordering. If unspecified, the categories are the unique values, sorted where possible and otherwise taken in order of appearance.
This requires 0.16.0 for the ability to specify categories directly in .astype; categoricals were introduced in 0.15.0.
In [10]: s = Series(list('aaabbaa')).astype('category',categories=list('ab'))

In [11]: s.cat.codes
Out[11]:
0    0
1    0
2    0
3    1
4    1
5    0
6    0
dtype: int8
If the categories are given as 'b', 'a' instead, the codes are the opposite of those above.
In [12]: s = Series(list('aaabbaa')).astype('category',categories=list('ba'))

In [13]: s.cat.codes
Out[13]:
0    1
1    1
2    1
3    0
4    0
5    1
6    1
dtype: int8
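Applied to the strings from the question, the same idea might look like the sketch below. The sample list `values` is made up for illustration. As far as I know, newer pandas releases no longer accept the categories= keyword in .astype as written above, so this sketch passes the categories to Categorical directly and, for a Series, goes through CategoricalDtype instead.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Hypothetical sample data containing the two strings from the question
values = ["abc", "xyz", "abc", "abc", "xyz"]

# Listing "xyz" first makes it code 0; "abc" then gets code 1
cat = pd.Categorical(values, categories=["xyz", "abc"])
codes = cat.codes

# The same thing for a Series, using a dtype object that carries the categories
dtype = CategoricalDtype(categories=["xyz", "abc"])
s = pd.Series(values).astype(dtype)
series_codes = s.cat.codes

print(list(codes))         # [1, 0, 1, 1, 0]
print(list(series_codes))  # [1, 0, 1, 1, 0]
```

Any value that is not in the listed categories becomes missing and gets the code -1, so it is worth listing every string you expect to see.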