Overview: Monitoring AI

Why Monitor AI

AI Monitoring is still a relatively new practice in the enterprise, because AI is only now moving from proof-of-concept projects to industry-grade products that are expected to meet minimum standards.

This transition matters because authorities and lawmakers have started drafting AI-specific laws. China and Brazil already have AI laws in place, and Europe has proposed an extensive AI Act to start regulating AI.

AI Monitoring is a critical requirement for regulating and building trustworthy AI, which is why it has gained so much traction in recent years. Monitoring AI ensures that:

The results are similar in both development and production environments

The workforce can focus on more strategic tasks instead of manually supervising AI

Faults are kept to a minimum and caught as soon as they occur

ROI is consistent for longer cycles

The time to resolve an issue is reduced significantly

The end user is affected as little as possible by fluctuations in the system

How AI Monitors Work

AI Monitors are:

Automatic: AI Monitors should ideally have auto-initialization capabilities along with customization options. Once set up, AI monitors can automatically monitor every corner of the solution and manage alerts without any human intervention.

Fast & Flexible: AI Monitors work in real-time to catch fatal faults in the solution. They are required to be flexible enough to accommodate different monitoring schedules. While some solutions might require constant supervision, others might do well with hourly monitoring.

Descriptive: AI Monitors are set up to catch issues in the system that could otherwise impact the end-user experience. If the monitoring alerts are not descriptive enough or lack properly defined segments, it becomes difficult to get a full picture of the issue.

All-encompassing: AI Monitors dive into every corner of the solution at hand. While they can monitor the solution as a whole, they are also able to pick up narrow data segments or model versions and track them closely if necessary.

Types of AI Monitors

AI models and datasets have several vitals that need to be watched. Based on these sensitive points, AI Monitors can be grouped into four high-level types:

1️⃣ Model Performance Monitors

Performance monitors take care of standard performance metrics such as accuracy, sensitivity, specificity, F1 score, etc. Users can either use auto-initialized generic performance monitors or customize them for specific data segments or threshold values.
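As a rough illustration of what a performance monitor computes, the sketch below derives accuracy, sensitivity, specificity, and F1 from binary labels and flags any metric that falls below a chosen minimum. The names `binary_performance` and `violated_thresholds`, and the threshold values, are assumptions made for this example and are not part of any particular platform's API.

```python
from dataclasses import dataclass


@dataclass
class PerformanceReport:
    accuracy: float
    sensitivity: float  # recall, or true positive rate
    specificity: float  # true negative rate
    f1: float


def binary_performance(y_true, y_pred):
    """Compute standard binary-classification metrics from 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def safe_div(num, den):
        # Avoid ZeroDivisionError when a class is absent from the data.
        return num / den if den else 0.0

    return PerformanceReport(
        accuracy=safe_div(tp + tn, tp + tn + fp + fn),
        sensitivity=safe_div(tp, tp + fn),
        specificity=safe_div(tn, tn + fp),
        f1=safe_div(2 * tp, 2 * tp + fp + fn),
    )


def violated_thresholds(report, minimums):
    """Return the names of metrics that fall below their minimum value."""
    return [name for name, floor in minimums.items()
            if getattr(report, name) < floor]


# Hypothetical labels and predictions for one data segment.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

report = binary_performance(y_true, y_pred)
alerts = violated_thresholds(report, {"accuracy": 0.8, "sensitivity": 0.7})
print(report, alerts)
```

Running the same functions on a filtered subset of the data is one simple way to express a segment-specific monitor.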

However, enterprise AI is not always measured by standard metrics. In fact, most business and development teams use highly specific business metrics to gauge the performance of their models. The Censius AI Observability Platform, unlike most, covers both standard and custom metrics.

2️⃣ Drift Monitors

Drift is a shift away from an expected pattern. There are two types of drift:

Data Drift: A shift in the distribution of features between training and serving data while the relationship between the input and target is unchanged.

Concept Drift: A type of model drift caused by changes over time in the properties of the dependent variable, that is, the target to be predicted.

The Censius AI Observability Platform supports custom alerts and thresholds that trigger user notifications. As soon as drift is detected, the platform alerts users and prompts them to take the next course of action, which might include adding new training data, model retraining, or model redevelopment. The concept drift monitor uses the ‘Early Drift Detection Method’ to monitor the frequency of output classes and notice significant variations.
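To make the idea of data drift concrete, here is a minimal sketch that compares a feature's serving distribution against its training baseline using the Population Stability Index (PSI). PSI is a common general-purpose drift score chosen here for illustration; it is not the Early Drift Detection Method mentioned above, and the function name, bin count, and the 0.2 alert threshold (a widely used rule of thumb) are all assumptions.

```python
import math


def population_stability_index(baseline, serving, bins=10):
    """PSI between two samples, using equal-width bins over the baseline range."""
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant baseline

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the baseline range fall into the edge bins.
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor keeps log() defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    base = bucket_shares(baseline)
    serve = bucket_shares(serving)
    return sum((s - b) * math.log(s / b) for b, s in zip(base, serve))


# Hypothetical feature values: training baseline vs. two serving windows.
training = [i / 100 for i in range(100)]          # spread over 0.00 .. 0.99
serving_stable = list(training)                   # same distribution
serving_shifted = [0.5 + i / 200 for i in range(100)]  # mass moved upward

PSI_ALERT_THRESHOLD = 0.2  # assumed rule-of-thumb threshold

psi_stable = population_stability_index(training, serving_stable)
psi_shifted = population_stability_index(training, serving_shifted)
drift_detected = psi_shifted > PSI_ALERT_THRESHOLD
print(psi_stable, psi_shifted, drift_detected)
```

A score like this only captures data drift on a single feature. Detecting concept drift generally also requires the actuals, since it concerns the relationship between inputs and the target.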

3️⃣ Data Quality Monitors

Data is the backbone of any AI application. Because data changes constantly, AI systems are highly dynamic, and the model built around the data needs to evolve with it.

There are several categories of data quality monitors that check different data parameters:

Missing value monitors: Missing value monitors track what the user would consider missing values in any feature, e.g. NaN, -1, null, or unk.

Data range monitors: Data range monitors observe violations in the minimum and maximum values of a feature, compared to a chosen baseline.

New value monitors: New value monitors observe violations in the categories of a feature by noting values that occur in that feature but not in the baseline.
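The three data quality checks above can be sketched in a few lines. This is an illustrative example only: the set of missing markers mirrors the values listed above, while the function names, the sample data, and the baseline range and categories are hypothetical.

```python
import math

# Markers a user might treat as "missing"; NaN is handled separately
# because NaN never compares equal to itself.
MISSING_MARKERS = {None, -1, "null", "unk"}


def is_missing(value):
    if isinstance(value, float) and math.isnan(value):
        return True
    return value in MISSING_MARKERS


def missing_rate(values):
    """Fraction of values that count as missing."""
    return sum(is_missing(v) for v in values) / len(values) if values else 0.0


def range_violations(values, baseline_min, baseline_max):
    """Non-missing values that fall outside the baseline min/max."""
    return [v for v in values
            if not is_missing(v) and not (baseline_min <= v <= baseline_max)]


def new_values(values, baseline_categories):
    """Categories seen in the data that the baseline never contained."""
    return {v for v in values if not is_missing(v)} - set(baseline_categories)


# Hypothetical serving data for a numeric and a categorical feature.
ages = [34, 51, -1, float("nan"), 130, 27]
plans = ["basic", "pro", "unk", "enterprise"]

age_missing = missing_rate(ages)
age_out_of_range = range_violations(ages, baseline_min=18, baseline_max=90)
unseen_plans = new_values(plans, baseline_categories={"basic", "pro"})
print(age_missing, age_out_of_range, unseen_plans)
```

Excluding missing markers before the range and new-value checks keeps a sentinel such as -1 from being reported twice, once as missing and once as a range violation.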

4️⃣ Activity Monitors

The volume of predictions and actuals is vital to understanding the health of the data endpoints. If data points are blocked by issues such as hardware failure, human error, or model disruptions, the end-user experience suffers significantly.

Activity monitors track the amount of data the model processes over various time periods. They catch a drop or rise in volume early on and report to the assigned team members.
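One simple way to picture an activity monitor is a check that compares each period's prediction volume with the average of the periods just before it. The sketch below does this for hourly counts. The function name, the trailing-window size, and the 50% tolerance are assumptions for illustration, not a description of how any specific platform implements it.

```python
from statistics import mean


def volume_alerts(hourly_counts, baseline_window=24, tolerance=0.5):
    """Flag hours whose volume deviates from the trailing mean.

    An hour is flagged when its count differs from the mean of the previous
    `baseline_window` hours by more than `tolerance` (as a fraction).
    """
    alerts = []
    for i in range(baseline_window, len(hourly_counts)):
        expected = mean(hourly_counts[i - baseline_window:i])
        if expected == 0:
            continue  # no meaningful baseline to compare against
        change = (hourly_counts[i] - expected) / expected
        if abs(change) > tolerance:
            kind = "drop" if change < 0 else "rise"
            alerts.append((i, kind, round(change, 2)))
    return alerts


# Hypothetical prediction counts per hour: a sudden drop, then a spike.
counts = [100, 100, 100, 100, 100, 100, 40, 100, 100, 210]
alerts = volume_alerts(counts, baseline_window=6, tolerance=0.5)
print(alerts)
```

A trailing window keeps the baseline current as traffic patterns change, at the cost of letting an anomalous hour influence the expectations for the hours that follow it.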

📜 Learn more at AI Observability