pandas_ml.core package

Submodules

class pandas_ml.core.frame.ModelFrame(data, target=None, *args, **kwargs)

Bases: pandas.core.frame.DataFrame, pandas_ml.core.generic.ModelPredictor

Data structure subclassing pandas.DataFrame that adds metadata identifying the target (response variable) and the data (explanatory variables / features).

Parameters:

data : same as pandas.DataFrame

target : str or array-like

Column name or values to be used as target

args : arguments passed to pandas.DataFrame

kwargs : keyword arguments passed to pandas.DataFrame
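
The two accepted forms of target (a column name or array-like values) can be illustrated with plain pandas. The sketch below does not use pandas_ml itself; split_data_target, the column names, and the series name "target" are hypothetical and chosen only for this example.

```python
import pandas as pd


def split_data_target(frame, target):
    """Hypothetical helper (not part of pandas_ml): split features from target."""
    if isinstance(target, str):
        # target is the name of a column that already exists in the frame
        return frame.drop(columns=[target]), frame[target]
    # target is array-like: one value per row of the frame
    return frame, pd.Series(list(target), index=frame.index, name="target")


df = pd.DataFrame(
    {"x1": [1.0, 2.0, 3.0], "x2": [0.5, 0.1, 0.9], "label": [0, 1, 0]}
)

# Target given as a column name
features, labels = split_data_target(df, "label")

# Target given as values
features2, labels2 = split_data_target(df[["x1", "x2"]], [1, 0, 1])
```

In both cases the result is a feature table plus a single response series, which is the split that the data and target properties described below expose.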

calibration

Property to access sklearn.calibration

cls

alias of GaussianProcessMethods

cluster

Property to access sklearn.cluster. See pandas_ml.skaccessors.cluster

covariance

Property to access sklearn.covariance. See pandas_ml.skaccessors.covariance

cross_decomposition

Property to access sklearn.cross_decomposition

cross_validation

Property to access sklearn.cross_validation. See pandas_ml.skaccessors.cross_validation

crv

Property to access sklearn.cross_validation. See pandas_ml.skaccessors.cross_validation

da

Property to access sklearn.discriminant_analysis

data

Return data (explanatory variable / features)

Returns:

data : ModelFrame

decision_function(estimator, *args, **kwargs)

Call estimator’s decision_function method.

Parameters:

args : arguments passed to decision_function method

kwargs : keyword arguments passed to decision_function method

Returns:

returned : decisions

decomposition

Property to access sklearn.decomposition

discriminant_analysis

Property to access sklearn.discriminant_analysis

dummy

Property to access sklearn.dummy

ensemble

Property to access sklearn.ensemble. See pandas_ml.skaccessors.ensemble

feature_extraction

Property to access sklearn.feature_extraction. See pandas_ml.skaccessors.feature_extraction

feature_selection

Property to access sklearn.feature_selection. See pandas_ml.skaccessors.feature_selection

fit_predict(estimator, *args, **kwargs)

Call estimator’s fit_predict method.

Parameters:

args : arguments passed to fit_predict method

kwargs : keyword arguments passed to fit_predict method

Returns:

returned : predicted result

fit_sample(estimator, *args, **kwargs)

Call estimator’s fit_sample method.

Parameters:

args : arguments passed to fit_sample method

kwargs : keyword arguments passed to fit_sample method

Returns:

returned : sampling result

fit_transform(estimator, *args, **kwargs)

Call estimator’s fit_transform method.

Parameters:

args : arguments passed to fit_transform method

kwargs : keyword arguments passed to fit_transform method

Returns:

returned : transformed result

gaussian_process

Property to access sklearn.gaussian_process. See pandas_ml.skaccessors.gaussian_process

gp

Property to access sklearn.gaussian_process. See pandas_ml.skaccessors.gaussian_process

grid_search

Property to access sklearn.grid_search. See pandas_ml.skaccessors.grid_search

groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False)

Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns.

Parameters:

by : mapping function / list of functions, dict, Series, or tuple / list of column names

Called on each element of the object index to determine the groups. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups

axis : int, default 0

level : int, level name, or sequence of such, default None

If the axis is a MultiIndex (hierarchical), group by a particular level or levels

as_index : boolean, default True

For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output

sort : boolean, default True

Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. groupby preserves the order of rows within each group.

group_keys : boolean, default True

When calling apply, add group keys to index to identify pieces

squeeze : boolean, default False

Reduce the dimensionality of the return type if possible; otherwise return a consistent type

Returns:

GroupBy object
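
A short sketch of how as_index and sort change the result. It uses a plain pandas.DataFrame rather than a ModelFrame, on the assumption that the parameters behave the same way for both; the column names are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"key": ["b", "a", "b", "a"], "value": [1, 2, 3, 4]})

# Default: group labels become the index, sorted by key.
by_index = df.groupby("key").sum()

# as_index=False: group labels stay in an ordinary column ("SQL-style" output).
flat = df.groupby("key", as_index=False).sum()

# sort=False: groups appear in order of first occurrence instead of sorted order.
unsorted = df.groupby("key", sort=False).sum()
```

Here by_index is indexed by "a", "b", flat carries key as a regular column, and unsorted keeps "b" before "a" because "b" occurs first in the input.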

Examples

DataFrame results

>>> data.groupby(func, axis=0).mean()
>>> data.groupby(['col1', 'col2'])['col3'].mean()

DataFrame with hierarchical index

>>> data.groupby(['col1', 'col2']).mean()

has_data()

Return whether ModelFrame has data

Returns:

has_data : bool

has_multi_targets()

Return whether ModelFrame has multiple target columns

Returns:

has_multi_targets : bool

has_target()

Return whether ModelFrame has target

Returns:

has_target : bool

imbalance

Property to access imblearn

inverse_transform(estimator, *args, **kwargs)

Call estimator’s inverse_transform method.

Parameters:

args : arguments passed to inverse_transform method

kwargs : keyword arguments passed to inverse_transform method

Returns:

returned : transformed result

isotonic

Property to access sklearn.isotonic. See pandas_ml.skaccessors.isotonic

kernel_approximation

Property to access sklearn.kernel_approximation

kernel_ridge

Property to access sklearn.kernel_ridge

lda

Property to access sklearn.lda

learning_curve

Property to access sklearn.learning_curve. See pandas_ml.skaccessors.learning_curve

linear_model

Property to access sklearn.linear_model. See pandas_ml.skaccessors.linear_model

lm

Property to access sklearn.linear_model. See pandas_ml.skaccessors.linear_model

manifold

Property to access sklearn.manifold. See pandas_ml.skaccessors.manifold

metrics

Property to access sklearn.metrics. See pandas_ml.skaccessors.metrics

mixture

Property to access sklearn.mixture

model_selection

Property to access sklearn.model_selection. See pandas_ml.skaccessors.model_selection

ms

Property to access sklearn.model_selection. See pandas_ml.skaccessors.model_selection

multiclass

Property to access sklearn.multiclass. See pandas_ml.skaccessors.multiclass

multioutput

Property to access sklearn.multioutput. See pandas_ml.skaccessors.multioutput

naive_bayes

Property to access sklearn.naive_bayes

neighbors

Property to access sklearn.neighbors. See pandas_ml.skaccessors.neighbors

neural_network

Property to access sklearn.neural_network

pipeline

Property to access sklearn.pipeline. See pandas_ml.skaccessors.pipeline

pp

Property to access sklearn.preprocessing. See pandas_ml.skaccessors.preprocessing

predict_log_proba(estimator, *args, **kwargs)

Call estimator’s predict_log_proba method.

Parameters:

args : arguments passed to predict_log_proba method

kwargs : keyword arguments passed to predict_log_proba method

Returns:

returned : log probabilities

predict_proba(estimator, *args, **kwargs)

Call estimator’s predict_proba method.

Parameters:

args : arguments passed to predict_proba method

kwargs : keyword arguments passed to predict_proba method

Returns:

returned : probabilities
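
The relationship between the results of predict_proba and predict_log_proba can be sketched in plain Python. In scikit-learn, log probabilities are the natural logarithm of the class probabilities; the numbers below are made up for illustration and do not come from a fitted estimator.

```python
import math

# Hypothetical class probabilities for two samples (each row sums to 1).
proba = [[0.25, 0.75], [0.5, 0.5]]

# Log probabilities are the element-wise natural logarithm of the probabilities.
log_proba = [[math.log(p) for p in row] for row in proba]

# Exponentiating the log probabilities recovers the original values.
recovered = [[math.exp(lp) for lp in row] for row in log_proba]
```

Log probabilities are never positive, and they are often preferred when many small probabilities have to be multiplied, because sums of logarithms are numerically more stable.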

preprocessing

Property to access sklearn.preprocessing. See pandas_ml.skaccessors.preprocessing

qda

Property to access sklearn.qda

random_projection

Property to access sklearn.random_projection. See pandas_ml.skaccessors.random_projection

sample(estimator, *args, **kwargs)

Call estimator’s sample method.

Parameters:

args : arguments passed to sample method

kwargs : keyword arguments passed to sample method

Returns:

returned : sampling result

score(estimator, *args, **kwargs)

Call estimator’s score method.

Parameters:

args : arguments passed to score method

kwargs : keyword arguments passed to score method

Returns:

returned : score

seaborn

Property to access seaborn API

semi_supervised

Property to access sklearn.semi_supervised. See pandas_ml.skaccessors.semi_supervised

sns

Property to access seaborn API

svm

Property to access sklearn.svm. See pandas_ml.skaccessors.svm

target

Return target (response variable)

Returns:

target : ModelSeries

target_name

Return target column name

Returns:

target : object

transform(estimator, *args, **kwargs)

Call estimator’s transform method.

Parameters:

args : arguments passed to transform method

kwargs : keyword arguments passed to transform method

Returns:

returned : transformed result

tree

Property to access sklearn.tree

xgb

Property to access xgboost.sklearn API

xgboost

Property to access xgboost.sklearn API

class pandas_ml.core.generic.ModelPredictor

Bases: pandas_ml.core.generic.ModelTransformer

Base class for ModelFrame and ModelFrameGroupBy

decision

Return current estimator’s decision function

Returns:

decisions : ModelFrame

estimator

Return most recently used estimator

Returns:

estimator : estimator

log_proba

Return current estimator’s log probabilities

Returns:

probabilities : ModelFrame

predict(estimator, *args, **kwargs)

Call estimator’s predict method.

Parameters:

args : arguments passed to predict method

kwargs : keyword arguments passed to predict method

Returns:

returned : predicted result
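
The pattern behind predict, together with the estimator and predicted properties documented in this class, can be sketched in plain Python. This is a minimal illustration of the delegation idea, not pandas_ml's implementation; MeanEstimator and MiniPredictor are hypothetical names.

```python
class MeanEstimator:
    """Toy estimator (hypothetical): predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class MiniPredictor:
    """Hypothetical stand-in for the delegation pattern; not pandas_ml code."""

    def __init__(self, data, target):
        self.data = data
        self.target = target
        self.estimator = None  # most recently used estimator
        self.predicted = None  # result of the latest predict call

    def fit(self, estimator, *args, **kwargs):
        # Pass the stored data and target to the estimator's own fit method.
        self.estimator = estimator
        return estimator.fit(self.data, self.target, *args, **kwargs)

    def predict(self, estimator, *args, **kwargs):
        # Pass the stored data to the estimator and remember the result.
        self.estimator = estimator
        self.predicted = estimator.predict(self.data, *args, **kwargs)
        return self.predicted


frame = MiniPredictor(data=[[1.0], [2.0], [3.0]], target=[10.0, 20.0, 30.0])
est = MeanEstimator()
frame.fit(est)
result = frame.predict(est)
```

The caller never passes the data explicitly: the container supplies it and keeps the last estimator and its predictions available for later inspection.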

predicted

Return current estimator’s predicted results

Returns:

predicted : ModelSeries

proba

Return current estimator’s probabilities

Returns:

probabilities : ModelFrame

class pandas_ml.core.generic.ModelTransformer

Bases: object

Base class for ModelFrame and ModelSeries

fit(estimator, *args, **kwargs)

Call estimator’s fit method.

Parameters:

args : arguments passed to fit method

kwargs : keyword arguments passed to fit method

Returns:

returned : None or fitted estimator

fit_transform(estimator, *args, **kwargs)

Call estimator’s fit_transform method.

Parameters:

args : arguments passed to fit_transform method

kwargs : keyword arguments passed to fit_transform method

Returns:

returned : transformed result

inverse_transform(estimator, *args, **kwargs)

Call estimator’s inverse_transform method.

Parameters:

args : arguments passed to inverse_transform method

kwargs : keyword arguments passed to inverse_transform method

Returns:

returned : transformed result

transform(estimator, *args, **kwargs)

Call estimator’s transform method.

Parameters:

args : arguments passed to transform method

kwargs : keyword arguments passed to transform method

Returns:

returned : transformed result
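
The four methods above all forward a call to the estimator that is passed in. The sketch below shows that forwarding with a round trip through fit_transform and inverse_transform. It is written in plain Python under the assumption that the container simply hands its values to the named estimator method; RangeScaler and MiniTransformer are hypothetical and not part of pandas_ml or scikit-learn.

```python
class RangeScaler:
    """Toy transformer (hypothetical): rescales values to the [0, 1] range."""

    def fit(self, values):
        self.min_ = min(values)
        self.range_ = max(values) - self.min_
        return self

    def transform(self, values):
        return [(v - self.min_) / self.range_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)

    def inverse_transform(self, scaled):
        return [s * self.range_ + self.min_ for s in scaled]


class MiniTransformer:
    """Hypothetical stand-in for a ModelTransformer-like container."""

    def __init__(self, values):
        self.values = values

    def _call(self, estimator, method_name, *args, **kwargs):
        # Look up the named method on the estimator and apply it to the stored values.
        method = getattr(estimator, method_name)
        return MiniTransformer(method(self.values, *args, **kwargs))

    def fit_transform(self, estimator, *args, **kwargs):
        return self._call(estimator, "fit_transform", *args, **kwargs)

    def inverse_transform(self, estimator, *args, **kwargs):
        return self._call(estimator, "inverse_transform", *args, **kwargs)


scaler = RangeScaler()
raw = MiniTransformer([0.0, 5.0, 10.0])
scaled = raw.fit_transform(scaler)
restored = scaled.inverse_transform(scaler)
```

Because each call returns a new container, transformations can be chained while the original values stay untouched.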

class pandas_ml.core.groupby.GroupedEstimator(estimator, grouped)

Bases: pandas_ml.core.base._BaseEstimator

Create grouped estimators based on passed estimator

class pandas_ml.core.groupby.ModelFrameGroupBy(obj, keys=None, axis=0, level=None, grouper=None, exclusions=None, selection=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Bases: pandas.core.groupby.DataFrameGroupBy, pandas_ml.core.generic.ModelPredictor

transform(func, *args, **kwargs)

Call estimator’s transform method.

Parameters:

args : arguments passed to transform method

kwargs : keyword arguments passed to transform method

Returns:

returned : transformed result

class pandas_ml.core.groupby.ModelSeriesGroupBy(obj, keys=None, axis=0, level=None, grouper=None, exclusions=None, selection=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Bases: pandas.core.groupby.SeriesGroupBy

pandas_ml.core.groupby.groupby(obj, by, **kwds)

Class for grouping and aggregating relational data. See aggregate, transform, and apply functions on this object.

It’s easiest to use obj.groupby(...) to use GroupBy, but you can also do:

grouped = groupby(obj, ...)

Parameters:

obj : pandas object

axis : int, default 0

level : int, default None

Level of MultiIndex

groupings : list of Grouping objects

Most users should ignore this

exclusions : array-like, optional

List of columns to exclude

name : string

Most users should ignore this

Returns:

Attributes

groups : dict

{group name -> group labels}

len(grouped) : int

Number of groups

Notes

After grouping, see aggregate, apply, and transform functions. Here are some other brief notes about usage. When grouping by multiple groups, the result index will be a MultiIndex (hierarchical) by default.

Iteration produces (key, group) tuples, i.e. chunking the data by group. So you can write code like:

grouped = obj.groupby(keys, axis=axis)
for key, group in grouped:
    # do something with the data
    pass

Function calls on GroupBy, if not specially implemented, “dispatch” to the grouped data. So if you group a DataFrame and wish to invoke the std() method on each group, you can simply do:

df.groupby(mapper).std()

rather than

df.groupby(mapper).aggregate(np.std)

You can pass arguments to these “wrapped” functions, too.
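
The notes above can be tried out with a small plain-pandas example. The frame and column names are invented for illustration, and the aggregation is named with the string "std" instead of np.std, which is an assumption made here to keep the example independent of NumPy.

```python
import pandas as pd

df = pd.DataFrame({"key": ["a", "a", "b", "b"], "x": [1.0, 3.0, 10.0, 14.0]})
grouped = df.groupby("key")

# Iteration yields (key, group) pairs, one sub-frame per group.
sizes = {key: len(group) for key, group in grouped}

# A method called on the GroupBy object is dispatched to every group ...
dispatched = grouped["x"].std()
# ... and gives the same result as spelling out the aggregation.
explicit = grouped["x"].aggregate("std")

# Arguments are forwarded to the wrapped method, here ddof=0 (population std).
population = grouped["x"].std(ddof=0)
```

With ddof=0 the divisor is the group size rather than the group size minus one, so the forwarded argument visibly changes the per-group result.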

See the online documentation for a full exposition of these topics and much more.

class pandas_ml.core.series.ModelSeries(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

Bases: pandas.core.series.Series, pandas_ml.core.generic.ModelTransformer

Wrapper for pandas.Series to support sklearn.preprocessing

groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False)

Group series using mapper (dict or key function, apply given function to group, return result as series) or by a series of columns.

Parameters:

by : mapping function / list of functions, dict, Series, or tuple / list of column names

Called on each element of the object index to determine the groups. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups

axis : int, default 0

level : int, level name, or sequence of such, default None

If the axis is a MultiIndex (hierarchical), group by a particular level or levels

as_index : boolean, default True

For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output

sort : boolean, default True

Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. groupby preserves the order of rows within each group.

group_keys : boolean, default True

When calling apply, add group keys to index to identify pieces

squeeze : boolean, default False

Reduce the dimensionality of the return type if possible; otherwise return a consistent type

Returns:

GroupBy object

Examples

DataFrame results

>>> data.groupby(func, axis=0).mean()
>>> data.groupby(['col1', 'col2'])['col3'].mean()

DataFrame with hierarchical index

>>> data.groupby(['col1', 'col2']).mean()

pp

Property to access sklearn.preprocessing. See pandas_ml.skaccessors.preprocessing

preprocessing

Property to access sklearn.preprocessing. See pandas_ml.skaccessors.preprocessing

to_frame(name=None)

Convert Series to DataFrame

Parameters:

name : object, default None

The passed name should substitute for the series name (if it has one).

Returns:

data_frame : DataFrame
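
A brief sketch of the name parameter, shown with a plain pandas.Series. ModelSeries documents the same signature, but the exact type it returns is not demonstrated here.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="original")

kept = s.to_frame()            # column takes the series name
renamed = s.to_frame("other")  # passed name replaces the series name
```

In both cases the result has a single column holding the series values; only the column label differs.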

Module contents