simplefit.eda
Module Contents
Functions
plot_distributions(data, bins=40, dist_cols=None, class_label=None) | Creates distribution plots for all the numeric columns or for the ones provided to it.
plot_corr(data, corr='spearman') | Creates a correlation plot for all the columns in the dataframe.
plot_splom(data, pair_cols=None) | Creates a SPLOM (scatter plot matrix) for all the numeric columns in the dataframe or for the ones passed by the user.
- simplefit.eda.plot_distributions(data, bins=40, dist_cols=None, class_label=None)[source]
Creates distribution plots (histograms) for either all the numeric columns or the ones provided to it.
- Parameters:
data (pandas.DataFrame) – The dataframe for which the distribution plots are created
bins (int, optional) – The number of bins for the histograms. Defaults to 40
dist_cols (list, optional) – The subset of numeric columns for which histograms are generated. If omitted, all numeric columns are used
class_label (str, optional) – The name of the target column. Needed only for classification datasets; not required for regression datasets
- Returns:
The Altair object for the plot
- Return type:
chart_numeric
Examples
>>> plot_distributions(data)
>>> plot_distributions(data, dist_cols=['loudness', 'acousticness'], class_label='target')
- simplefit.eda.plot_corr(data, corr='spearman')[source]
Creates a correlation plot for all the columns in the dataframe.
- Parameters:
data (pandas.DataFrame) – The dataframe for which the correlation plot is created
corr (str, optional) – The correlation method: one of 'spearman', 'kendall' or 'pearson'. Defaults to 'spearman'
- Returns:
The Altair object for the plot
- Return type:
corr_plot
Examples
>>> plot_corr(data)
>>> plot_corr(data, corr='kendall')
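The three values accepted by corr measure different things, which can matter when choosing one. The following is a minimal standard-library sketch of the usual textbook definitions, not simplefit's own implementation; the helper names (ranks, pearson, spearman, kendall) and the sample data are hypothetical, and the Kendall helper assumes the data has no ties.

```python
from itertools import combinations


def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation: strength of the linear relationship."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs (no ties assumed)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    total_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: y increases monotonically, but not linearly, with x.
x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]

results = {
    "pearson": pearson(x, y),
    "spearman": spearman(x, y),
    "kendall": kendall(x, y),
}
for method, value in results.items():
    print(f"{method}: {value:.3f}")
```

On data like this, the two rank-based measures report a perfect monotonic association while Pearson stays below 1, because the relationship is not a straight line. A rank-based method is therefore a reasonable choice when relationships between columns may be non-linear.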
- simplefit.eda.plot_splom(data, pair_cols=None)[source]
Creates a SPLOM (scatter plot matrix) for all the numeric columns in the dataframe or the ones passed by the user.
- Parameters:
data (pandas.DataFrame) – The dataframe for which the SPLOM is created
pair_cols (list, optional) – The list of dataframe columns for which the SPLOM is generated. If omitted, all numeric columns are used
- Returns:
The Altair object for the plot
- Return type:
splom_chart
Examples
>>> plot_splom(data)
>>> plot_splom(data, pair_cols=['loudness', 'acousticness', 'energy'])
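A scatter plot matrix pairs every selected column with every other one, so the number of panels grows with the square of the number of columns. The sketch below only enumerates that grid with the standard library, using the column names from the example above. It draws nothing and makes no claim about how simplefit renders the diagonal panels.

```python
from itertools import product

# Column names taken from the pair_cols example; any list of columns works.
pair_cols = ["loudness", "acousticness", "energy"]

# One panel per ordered (row, column) pair, including the diagonal
# where a column is paired with itself.
panels = list(product(pair_cols, repeat=2))

# Print the grid row by row to show the layout.
for row_col in pair_cols:
    print(" | ".join(f"{row_col} vs {col}" for col in pair_cols))

print(f"{len(pair_cols)} columns give {len(panels)} panels")
```

Passing a short pair_cols list is thus a simple way to keep the matrix readable on wide dataframes.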