
API Reference

Get started with the Censius Python SDK

To install the latest version of the Python package, use pip:
python
pip install censius
You can also install a specific version of the censius package:
python
pip install censius==VERSION_HERE
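To confirm which version actually ended up installed, one option is the standard library's importlib.metadata lookup. The sketch below is not part of the Censius SDK, and the helper name installed_version is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the installed censius version, or None if it is not installed
print(installed_version("censius"))
```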

Initialization

censius_client()

The first step is to instantiate a client object with the API key and tenant ID that we have provided to your enterprise. The client then uses these credentials to authenticate every call made through the SDK.
The API key and the tenant ID can be found in the Settings tab on the console.
python
from censius import CensiusClient, ModelType, DatasetType, ExplanationType, Dataset

# Replace YOUR_API_KEY and YOUR_TENANT_ID with the values from the Settings tab
client = CensiusClient(api_key=YOUR_API_KEY, tenant_id=YOUR_TENANT_ID)
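Rather than hard-coding credentials in source files, you may prefer to read them from the environment. The following is a minimal sketch under that assumption; the variable names CENSIUS_API_KEY and CENSIUS_TENANT_ID and the helper load_credentials are hypothetical and are not defined by the SDK.

```python
import os


def load_credentials(env=None):
    """Read the API key and tenant ID from environment variables.

    CENSIUS_API_KEY and CENSIUS_TENANT_ID are hypothetical names chosen
    for this example; use whatever names your deployment defines.
    """
    env = os.environ if env is None else env
    names = ("CENSIUS_API_KEY", "CENSIUS_TENANT_ID")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env["CENSIUS_API_KEY"], env["CENSIUS_TENANT_ID"]
```

The returned pair can then be passed to CensiusClient as api_key and tenant_id.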

Setup

register_project()

Projects contain multiple models (and their versions) that relate to a common business use case. They also help manage access control when sharing a group of models with other users.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| name | string | The name of the project | Required |
python
client.register_project( name = 'Movies' )

register_dataset()

You can use the register_dataset API to register a dataset with the Censius platform.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| name | string | The name of the dataset | Required |
| file | dataframe | The feature values, passed as a DataFrame. Right now, only CSV files are supported as the source. | Required |
| project_id | int | The ID of the project in which you want to register the dataset | Required |
| features | list<dict> | The list of columns for the dataset. It is a list of dictionaries, each with two keys, name and type. Valid values of type are DatasetType.STRING, DatasetType.INT, DatasetType.BOOLEAN, and DatasetType.DECIMAL. | Required |
| timestamp | dict | The default timestamp column to be processed, if it is part of the dataset. The accepted timestamp types are DatasetType.UNIX_MS, DatasetType.UNIX_NS, DatasetType.UNIX_S, and DatasetType.ISO. | Optional |
python
import pandas as pd

# Load the CSV into a DataFrame; it is passed as the file argument
dataframe_object = pd.read_csv("titanic.csv")

client.register_dataset(
    name="titanic_dataset",
    file=dataframe_object,
    project_id=121,
    features=[
        {"name": "Survived", "type": DatasetType.INT},
        {"name": "Pclass", "type": DatasetType.INT},
        {"name": "Sex", "type": DatasetType.STRING},
        {"name": "Age", "type": DatasetType.DECIMAL},
        {"name": "SibSp", "type": DatasetType.INT},
        {"name": "Parch", "type": DatasetType.INT},
        {"name": "Fare", "type": DatasetType.DECIMAL},
    ],
    timestamp={"name": "Timestamp", "type": DatasetType.UNIX_MS},
)
💽
Download the titanic dataset that we use in the example here.
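For wide datasets, writing the features list by hand gets tedious. The sketch below guesses a type name for each CSV column using only the standard library. It is an illustration rather than SDK functionality: infer_type_name and infer_features are hypothetical helpers, and they return the names "INT", "DECIMAL", "BOOLEAN", and "STRING" as plain strings so that the example runs without the SDK installed.

```python
import csv
import io


def _all_parse(values, caster):
    """Return True if every value can be converted with caster."""
    try:
        for value in values:
            caster(value)
    except ValueError:
        return False
    return True


def infer_type_name(values):
    """Guess a DatasetType member name from a column's string values."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "STRING"  # nothing to go on, fall back to the widest type
    if all(v.lower() in ("true", "false") for v in cleaned):
        return "BOOLEAN"
    if _all_parse(cleaned, int):
        return "INT"
    if _all_parse(cleaned, float):
        return "DECIMAL"
    return "STRING"


def infer_features(csv_text):
    """Build a features-style list of {"name", "type"} dicts from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    columns = reader.fieldnames or []
    return [
        {"name": col, "type": infer_type_name([row.get(col) for row in rows])}
        for col in columns
    ]
```

Assuming the DatasetType members are reachable by name, each guessed name could be mapped with getattr(DatasetType, entry["type"]) before calling register_dataset. Review the guesses first: a column such as Age that happens to hold only whole numbers would be reported as INT.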

register_model()

You can use this API to register a new model to the Censius platform. For subsequent updates to the model with new versions, register_new_model_version() should be called.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| model_id | string | The ID of the model | Required |
| model_name | string | The name of the model | Required |
| model_type | enum | The type of the model's targets. Currently supported values are ModelType.BINARY_CLASSIFICATION and ModelType.REGRESSION. | Required |
| model_version | string | A string that represents the version of the model | Required |
| training_info | dict | Records the ID of the dataset the model was trained on | Required |
| project_id | int | The ID of the project to which the model belongs | Required |
| targets | list<string> | The columns the model predicts | Required |
| features | list<string> | The columns the model uses to predict the targets | Required |
python
client.register_model(
    model_id="titanic_m",
    model_name="titanic model",
    model_type=ModelType.BINARY_CLASSIFICATION,
    model_version="v1",
    # ID of the registered dataset the model was trained on
    training_info={"method": Dataset.ID, "id": 262},
    project_id=121,
    targets=["Survived"],
    features=["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"],
)
⚠️
model_id must be unique across an entire tenant. Combination of model_id and model_version must be unique as well.

register_new_model_version()

You can use this API to add a new version to an existing model, for example "v2" of a model.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| model_id | string | The ID of the model | Required |
| model_version | string | A string that represents the version of the model | Required |
| training_info | dict | Records the ID of the dataset the model was trained on | Required |
| project_id | int | The ID of the project to which the model belongs | Required |
| targets | list<string> | The columns the model predicts | Required |
| features | list<string> | The columns the model uses to predict the targets | Required |
python
client.register_new_model_version(
    model_id="titanic_m",
    model_version="v2",
    # ID of the registered dataset this version was trained on
    training_info={"method": Dataset.ID, "id": 262},
    project_id=121,
    targets=["Survived"],
    features=["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare"],
)
⚠️
 model_version must be unique here as you are registering a new version of an existing model

Logging predictions, features, and explanations

log()

This function enables logging individual predictions, features (and optionally explanations). It can be integrated as part of the production environment to log these values as predictions are made.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| prediction_id | string | The ID of this prediction log. It can be used to update the actual of this log later. | Required |
| model_id | string | The model ID against which you want to log the prediction | Required |
| model_version | string | The version of the model against which you want to log the prediction | Required |
| features | dict | A dict with feature names as keys and processed feature values as values | Required |
| prediction | dict | A dictionary with target names as keys. Each value is a dict with the key label and, optionally, confidence. For example, "Loan Status": {"label": 2, "confidence": 0.2} | Required |
| timestamp | int | UNIX epoch timestamp in milliseconds. Use int(time.time() * 1000) to indicate the current time. | Required |
| actual | dict | A dictionary containing the actual for the prediction log. The keys are the target names and the values are the ground truth values of those targets. | Optional |
💡
When you generate the timestamp from the current time, for example with int(time.time() * 1000), remember that the time is calculated in UTC on the client side, not the server side.
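One way to produce a millisecond UNIX timestamp with the standard library is sketched below. The helpers now_ms and to_epoch_ms are hypothetical names, and treating naive datetimes as UTC is an assumption of this example, not documented SDK behaviour.

```python
import time
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def now_ms():
    """Current client-side time as integer milliseconds since the UNIX epoch."""
    return time.time_ns() // 1_000_000


def to_epoch_ms(dt):
    """Convert a datetime to integer milliseconds since the UNIX epoch."""
    if dt.tzinfo is None:
        # Assumption: naive datetimes are interpreted as UTC
        dt = dt.replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding
    return (dt - EPOCH) // timedelta(milliseconds=1)
```

Either value can be passed as the timestamp argument of log().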
Logging a single prediction
python
client.log(
    prediction_id="05e75d5771e54c936068987a8aaa117",
    model_id="titanic_m",
    model_version="v1",
    # Feature names must match those registered for the model
    features={
        "Pclass": 2,
        "Sex": "male",
        "SibSp": 2,
        "Parch": 3,
        "Fare": 15,
        "Age": 15,
    },
    prediction={"Survived": {"label": 1, "confidence": 1}},
    timestamp=1659331134574,  # UNIX epoch in milliseconds
)
💡
Logs are currently aggregated every 60 minutes by default. This can be changed in custom deployments; reach out to us if you need a different frequency.
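In production code it can help to build the prediction_id and the prediction dict with small helpers, so that every call to log() has the same shape. This is only a sketch: new_prediction_id and build_prediction are hypothetical helpers, and the check that confidence lies between 0 and 1 is an assumption, not a documented SDK rule.

```python
import uuid


def new_prediction_id():
    """Generate a random ID string to use as prediction_id."""
    return uuid.uuid4().hex


def build_prediction(target, label, confidence=None):
    """Return a dict shaped like {target: {"label": ..., "confidence": ...}}."""
    entry = {"label": label}
    if confidence is not None:
        # Assumption: confidence is a probability between 0 and 1
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        entry["confidence"] = confidence
    return {target: entry}
```

Store the generated ID alongside your own records, since you need the same value later to attach an actual or an explanation to this log.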

log_actual()

If the actual wasn't available when log() was called, it can be updated at a later time using log_actual(). This can be the case for certain types of models where the ground truth isn't immediately available.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| prediction_id | string | The prediction ID against which you want to update the actual | Required |
| actual | dict | A dictionary containing the actual to be updated. The keys are the target names and the values are the ground truth values of those targets. | Required |
| model_id | string | The model ID for the prediction for which you need to update the actual | Required |
| model_version | string | The model version for the prediction for which you need to update the actual | Required |

💡
Keys in the actual attribute should match the targets of the model. For example, if your model's target column is Loan, the actual attribute should be a dict of the format {"Loan": ACTUAL_VALUE} when updating the actual.
python
client.update_actual(
    prediction_id="05e75d5771e54c936068987a8aaa117",
    model_id="titanic_m",
    model_version="v1",
    # Keys must match the model's registered targets
    actual={"Survived": 1},
)
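Because the keys of actual have to line up with the model's targets, a small client-side check before sending can catch typos early. The helper below is a hypothetical sketch. It makes no SDK calls and assumes you keep the list of registered targets available in your own code.

```python
def check_actual(actual, targets):
    """Raise ValueError unless the keys of actual equal the model targets."""
    expected, given = set(targets), set(actual)
    if given != expected:
        raise ValueError(
            f"actual keys {sorted(given)} do not match model targets {sorted(expected)}"
        )
    return actual


# Passes: the only key matches the registered target "Survived"
check_actual({"Survived": 1}, ["Survived"])
```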

log_explanations()

| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| prediction_id | string | The prediction ID against which you want to log the explanation | Required |
| model_id | string | The model ID for the prediction for which you want to log the explanation | Required |
| model_version | string | The model version for the prediction for which you want to log the explanation | Required |
| explanation_type | enum | The type of explanation. Currently supports ExplanationType.SHAP | Required |
| explanation_values | dict | A dictionary containing features and their explanations. The keys are the feature names and the values are the explanation values. | Required |
python
client.log_explanation(
    model_id="titanic_m",
    model_version="v1",
    prediction_id="05e75d5771e54c936068987a8aaa117",
    explanation_type=ExplanationType.SHAP,
    # One SHAP value per registered feature
    explanation_values={
        "Pclass": 0.544,
        "Sex": 0.722,
        "SibSp": 0.112,
        "Parch": 0.053,
        "Fare": 0.344,
        "Age": 0.622,
    },
)
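Explainers usually return SHAP values as a plain sequence ordered like the feature columns, so they need to be paired with feature names before logging. A minimal sketch, assuming the values arrive in the same order as the names; build_explanation_values is a hypothetical helper, not an SDK function.

```python
def build_explanation_values(feature_names, shap_values):
    """Pair feature names with per-feature SHAP values as a plain dict."""
    if len(feature_names) != len(shap_values):
        raise ValueError("feature_names and shap_values must have the same length")
    # float() converts array scalar types into plain Python floats
    return {name: float(value) for name, value in zip(feature_names, shap_values)}
```

The result has the shape expected for explanation_values.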

Bulk Log Insertion

bulk_log()

This function lets you send prediction, actual, and explanation logs in bulk. It can be integrated into a production environment where you collect the model logs and send them all together in a single insertion call (for example, once a day).
💡
At least one of predictions, actuals, and explanations must be present in the bulk_log call.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| input | Pandas DataFrame | Pandas DataFrame of bulk logs containing prediction, actual, and explanation values | Required |
| model_id | string | The model ID against which you want to log the bulk insertion | Required |
| model_version | string | The version of the model for which you want to log the bulk insertion | Required |
| prediction_id_column | string | Name of the ID column in the input DataFrame. The values in this column must be non-null and unique. | Required |
| predictions | object | A Prediction.Tabular object that collects information about the prediction and feature columns in the input DataFrame. See the Prediction.Tabular table below. | Optional |
| actuals | string | Name of the column in the input DataFrame that holds the actual values | Optional |
| explanations | object | An Explanation.Tabular object that collects information about the explanation values, explanation type, and feature columns in the input DataFrame. See the Explanation.Tabular table below. | Optional |
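Since the prediction ID column must be non-null and unique, it may be worth checking it before calling bulk_log. The sketch below works on any sequence of IDs, for instance the values of that column converted to a list. check_prediction_ids is a hypothetical helper, and treating NaN as null is an assumption that mirrors how pandas represents missing values.

```python
import math
from collections import Counter


def check_prediction_ids(ids):
    """Return (null_count, duplicated_values) for a sequence of prediction IDs."""

    def is_null(value):
        # pandas represents missing values as None or float NaN
        return value is None or (isinstance(value, float) and math.isnan(value))

    values = list(ids)
    null_count = sum(1 for value in values if is_null(value))
    counts = Counter(value for value in values if not is_null(value))
    duplicates = [value for value, n in counts.items() if n > 1]
    return null_count, duplicates
```

A result of (0, []) means the column satisfies both constraints.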
Prediction.Tabular
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| timestamp_column | timestamp | Name of the column that specifies the timestamp for each prediction in the input DataFrame | Required |
| prediction_column | string | Name of the column that specifies the prediction values in the input DataFrame. This column must not contain nulls. | Required |
| prediction_confidence_column | float | Name of the column that specifies the prediction confidence values in the input DataFrame. This column must not contain nulls. | Required |
| features | list<object> | List of objects that map registered features to column names in the input DataFrame. Example: {"feature": "Age", "input_column": "age_in_years"}. Here, "Age" was specified when registering the model and "age_in_years" is the DataFrame column that holds the "Age" feature values in the bulk logs. | Optional |
Explanation.Tabular
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| type | enum | The type of explanation. Currently supports ExplanationType.SHAP | Required |
| explanation_mapper | list<object> | List of objects that map registered features to column names in the input DataFrame. Example: {"feature": "Age", "input_column": "age_shap"}. Here, "Age" was specified when registering the model and "age_shap" is the DataFrame column that holds the SHAP values of the "Age" feature in the bulk logs. | Required |
python
from censius import CensiusClient, Prediction, Explanation, ExplanationType
import pandas as pd

BULK_LOG_CSV_PATH = "<path-to-csv>"
DATAFRAME = pd.read_csv(BULK_LOG_CSV_PATH)

# client is the CensiusClient instance created during initialization
client.bulk_log(
    input=DATAFRAME,
    prediction_id_column="log_id",
    model_id="<model_id>",
    model_version="<model_version>",
    predictions=Prediction.Tabular(
        timestamp_column="Timestamp",
        prediction_column="prediction_survived",
        prediction_confidence_column="prediction_confidence",
        # Map each registered feature to its column in the DataFrame
        features=[
            {"feature": "Age", "input_column": "age_in_years"},
            {"feature": "Sex", "input_column": "sex_details"},
            {"feature": "Pclass", "input_column": "pclass_details"},
            {"feature": "SibSp", "input_column": "sibsp_details"},
            {"feature": "Parch", "input_column": "parch_details"},
            {"feature": "Fare", "input_column": "fare_details"},
        ],
    ),
    actuals="actual_survived",
    explanations=Explanation.Tabular(
        type=ExplanationType.SHAP,
        # Map each registered feature to the column holding its SHAP value
        explanation_mapper=[
            {"feature": "Age", "input_column": "age_shap"},
            {"feature": "Sex", "input_column": "sex_shap"},
            {"feature": "Pclass", "input_column": "pclass_shap"},
            {"feature": "SibSp", "input_column": "sibsp_shap"},
            {"feature": "Parch", "input_column": "parch_shap"},
            {"feature": "Fare", "input_column": "fare_shap"},
        ],
    ),
)
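When the DataFrame columns follow a naming convention, the feature and explanation mappings can be generated instead of typed out. The following sketch assumes such a convention exists in your data. build_mapper and missing_columns are hypothetical helpers, and the lowercase-plus-suffix rule is only an example.

```python
def build_mapper(features, column_for):
    """Build a list of {"feature", "input_column"} dicts.

    column_for is a function that turns a registered feature name
    into the name of the matching DataFrame column.
    """
    return [{"feature": name, "input_column": column_for(name)} for name in features]


def missing_columns(mapper, available_columns):
    """List mapped input columns that are absent from the DataFrame columns."""
    available = set(available_columns)
    return [m["input_column"] for m in mapper if m["input_column"] not in available]


FEATURES = ["Age", "Sex", "Pclass", "SibSp", "Parch", "Fare"]
# Example convention: SHAP columns are named "<feature in lowercase>_shap"
shap_mapper = build_mapper(FEATURES, lambda name: name.lower() + "_shap")
```

Calling missing_columns(shap_mapper, DATAFRAME.columns) before bulk_log shows which mapped columns are not present in the input.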

Updating model metadata

update_model()

If there is a model retrain in the production environment, you can use this function to mark the start and end time of the production data that the model was retrained on.
| Argument | Type | Description | Required |
| --- | --- | --- | --- |
| model_id | string | The ID of the model for which you want to update the rolling window | Required |
| model_version | string | The version of the model for which you want to update the rolling window | Required |
| training_info | dict | Start and end UNIX epoch timestamps for the window of retrain | Required |
python
client.update_model(
    model_id="titanic_m",
    model_version="v1",
    # Start and end of the production data window used for the retrain
    training_info={
        "method": Dataset.FIXED,
        "start_time": 1648837800000,
        "end_time": 1648837941232,
    },
)
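The start_time and end_time values in the example are 13-digit numbers, which suggests epoch milliseconds. Under that assumption, the window can be derived from timezone-aware datetimes as sketched below. training_window_ms is a hypothetical helper, and the "method" key still has to be added by the caller.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def training_window_ms(start, end):
    """Return start_time and end_time as epoch milliseconds."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("start and end must be timezone-aware datetimes")
    if end <= start:
        raise ValueError("end must be later than start")

    def to_ms(dt):
        # Integer division of timedeltas avoids floating-point rounding
        return (dt - EPOCH) // timedelta(milliseconds=1)

    return {"start_time": to_ms(start), "end_time": to_ms(end)}
```

The returned dict can be merged into training_info, for example {"method": Dataset.FIXED, **training_window_ms(start, end)}.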
