model_tests.FEAT.SHAPFeatureImportance

SHAPFeatureImportance Objects

@dataclass
class SHAPFeatureImportance(ModelTest)

Test whether subgroups of the protected attributes rank among the most important variables by Shapley (SHAP) feature importance.

To pass, subgroups should not fall in the top n most important variables.

The test also stores a dataframe showing the results for each subgroup.

Arguments:

  • attrs - List of protected attributes.
  • threshold - Threshold for the test. To pass, subgroups should not fall in the top n (threshold) most important variables.
  • test_name - Name of the test, default is 'Shapely Feature Importance Test'.
  • test_desc - Description of the test. If none is provided, an automatic description will be generated based on the rest of the arguments passed in.
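The pass condition described above can be sketched in plain Python. This is an illustration only, not the library's implementation: it assumes importance is measured as the mean absolute SHAP value per feature, and that encoded subgroup columns are named with the protected attribute as a prefix (for example `gender_female`). The helpers `rank_features` and `top_n_check`, and the sample values, are hypothetical.

```python
def rank_features(shap_values, feature_names):
    """Rank features by mean absolute SHAP value (1 = most important)."""
    # shap_values holds one row per sample and one column per feature.
    n_samples = len(shap_values)
    importance = {
        name: sum(abs(row[i]) for row in shap_values) / n_samples
        for i, name in enumerate(feature_names)
    }
    ordered = sorted(importance, key=importance.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered, start=1)}


def top_n_check(ranks, attrs, threshold):
    """Per-subgroup results: a subgroup passes if it ranks outside the top n."""
    results = {}
    for name, rank in ranks.items():
        # Assumption: subgroup columns are prefixed with the attribute name.
        if any(name.startswith(f"{attr}_") for attr in attrs):
            results[name] = {"rank": rank, "passed": rank > threshold}
    return results


# Hypothetical SHAP values for three samples and four encoded features.
feature_names = ["income", "age", "gender_female", "gender_male"]
shap_values = [
    [0.40, -0.10, 0.30, -0.02],
    [-0.50, 0.20, -0.25, 0.01],
    [0.30, -0.15, 0.35, -0.03],
]

ranks = rank_features(shap_values, feature_names)
results = top_n_check(ranks, attrs=["gender"], threshold=2)

# The overall test passes only if every subgroup passes.
test_passed = all(r["passed"] for r in results.values())
```

With a threshold of 2, `gender_female` ranks second and therefore fails, so the overall test fails even though `gender_male` passes.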

get_shap_values

def get_shap_values(model, model_type, x_train_encoded, x_test_encoded) -> list

Get SHAP values for a set of test samples.

Arguments:

  • model - Trained model object.
  • model_type - Type of model algorithm; choose from 'trees' or 'others'.
  • x_train_encoded - Training data features, categorical features have to be encoded.
  • x_test_encoded - Test data to be used for Shapley explanations, categorical features have to be encoded.
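For readers unfamiliar with the quantity being returned, the sketch below computes exact Shapley values for a single sample by brute force, averaging each feature's marginal contribution over every feature ordering. It does not show how this method or the SHAP library works internally; practical explainers estimate these values far more efficiently. The `predict` function, feature meanings, and all-zero baseline are hypothetical.

```python
from itertools import permutations


def exact_shapley(predict, x, baseline):
    """Exact Shapley values for one sample, averaged over all feature orderings."""
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)      # start from the baseline input
        prev = predict(current)
        for i in order:
            current[i] = x[i]         # switch feature i to its actual value
            new = predict(current)
            phi[i] += new - prev      # marginal contribution of feature i
            prev = new
    return [total / len(orderings) for total in phi]


def predict(features):
    # Hypothetical model with an interaction between income and gender_female.
    income, age, gender_female = features
    return 2.0 * income + 0.5 * age + 1.0 * income * gender_female


x = [1.0, 2.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(predict, x, baseline)
```

The values sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved.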

shap_summary_plot

def shap_summary_plot(x_test_encoded, save_plots: bool = True)

Make a SHAP summary plot.

Arguments:

  • x_test_encoded - Data to be used for Shapley explanations, categorical features have to be encoded.
  • save_plots - If True, saves the plots to the class instance.

get_result

def get_result(model, model_type: str, x_train_encoded: pd.DataFrame, x_test_encoded: pd.DataFrame) -> pd.DataFrame

Output a dataframe containing the test results of the protected attributes.

Arguments:

  • model - Trained model object.
  • model_type - Type of model algorithm; choose from 'trees' or 'others'.
  • x_train_encoded - Training data features, categorical features have to be encoded.
  • x_test_encoded - Test data to be used for Shapley explanations, categorical features have to be encoded.

shap_dependence_plot

def shap_dependence_plot(x_test_encoded, show_all: bool = True, save_plots: bool = True)

Create a SHAP partial dependence plot to show the effect of the individual subgroups on the Shapley value.

Arguments:

  • x_test_encoded - Test data to be used for Shapley explanations, categorical features have to be encoded.
  • show_all - If False, only show subgroups that failed the test.
  • save_plots - If True, saves the plots to the class instance.

run

def run(model, model_type: Literal["trees", "others"], x_train_encoded: pd.DataFrame, x_test_encoded: pd.DataFrame) -> bool

Runs the test by calculating the result and evaluating whether it passes the defined condition.

Arguments:

  • model - Trained model object.
  • model_type - Type of model algorithm; choose from 'trees' or 'others'.
  • x_train_encoded - Training data features, categorical features have to be encoded.
  • x_test_encoded - Test data to be used for Shapley explanations, categorical features have to be encoded.