python - Selecting a column value based on a different column value after applying groupby
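I am able to get this to work, but not after I apply groupby. In this example I want the last column to contain the y value at the lowest value of column x. I have populated the df with a column called yminx to show what the abc column should look like. I can't get the value of abc to be the local (after groupby) min.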
    In [3]: df
    Out[3]:
       symbol   x   y  yminx
    0     ibm  12  27     58
    1     ibm   1  58     58
    2     ibm  13  39     58
    3     ibm   4  45     58
    4      gs   5  72     44
    5      gs  15  54     44
    6      gs  20  50     44
    7      gs   4  90     44
    8      gs  14  39     44
    9      gs   2  44     44
    10     gs   7  79     44
    11     gs  12  27     44
    12     gs  11  66     44

    df['try'] = df.groupby(['symbol'])['x'].transform('min')
    df['cond1'] = df['x'] == min(df['x'])
    df['abc'] = np.select(df['cond1'], df['y'])

       symbol   x   y  yminx  cond1  abc  try
    0     ibm  12  27     58  False   58    1
    1     ibm   1  58     58   True   58    1
    2     ibm  13  39     58  False   58    1
    3     ibm   4  45     58  False   58    1
    4      gs   5  72     90  False   58    2
    5      gs  15  54     90  False   58    2
    6      gs  20  50     90  False   58    2
    7      gs   4  90     90  False   58    2
    8      gs  14  39     90  False   58    2
    9      gs   2  44     90  False   58    2
    10     gs   7  79     90  False   58    2
    11     gs  12  27     90  False   58    2
    12     gs  11  66     90  False   58    2
In the output you can see 58 being selected for ibm, but for gs the same min is carried on, as if the groupby was never referenced.
I'm sure it's a syntax thing, but I'm stuck.
Thanks for the help,
John
One way is to work with the indices of the minimum values. For example:
    >>> imin = df.groupby("symbol")["x"].transform("idxmin")
    >>> df["yminx"] = df.loc[imin, "y"].values
    >>> df
       symbol   x   y  yminx
    0     ibm  12  27     58
    1     ibm   1  58     58
    2     ibm  13  39     58
    3     ibm   4  45     58
    4      gs   5  72     44
    5      gs  15  54     44
    6      gs  20  50     44
    7      gs   4  90     44
    8      gs  14  39     44
    9      gs   2  44     44
    10     gs   7  79     44
    11     gs  12  27     44
    12     gs  11  66     44
(The .values is needed because the result of df.loc has its own index, and we want to ignore that and only care about the values instead.)
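For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the frame from the symbol, x and y columns shown in the question and applies the idxmin approach. The `yminx_alt` column and the `y_at_min` name are my own illustration, not part of the original posts: they show a second route that maps each symbol to the y value found at its minimum x, which sidesteps the index question entirely.

```python
import pandas as pd

# Data copied from the question (symbol, x, y columns only).
df = pd.DataFrame({
    "symbol": ["ibm"] * 4 + ["gs"] * 9,
    "x": [12, 1, 13, 4, 5, 15, 20, 4, 14, 2, 7, 12, 11],
    "y": [27, 58, 39, 45, 72, 54, 50, 90, 39, 44, 79, 27, 66],
})

# For every row, the row label of the minimum x within that row's symbol group.
imin = df.groupby("symbol")["x"].transform("idxmin")

# Look up y at those labels; .values drops the looked-up index so the
# assignment is positional rather than aligned by label.
df["yminx"] = df.loc[imin, "y"].values

# Alternative (illustrative names): one y value per symbol, then map it back.
rows_at_min = df.groupby("symbol")["x"].idxmin()
y_at_min = df.loc[rows_at_min, ["symbol", "y"]].set_index("symbol")["y"]
df["yminx_alt"] = df["symbol"].map(y_at_min)

print(df)
```

Note that the attempt in the question compares against `min(df['x'])`, which is the minimum over the whole column rather than per group, so only the ibm row with x equal to 1 can ever match.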