Data drift occurs when machine learning (ML) models are deployed in environments whose data no longer resembles the data on which the models were trained. As this gap widens, model performance can deteriorate. In this blog post from the Software Engineering Institute (SEI) at Carnegie Mellon University, we introduce Portend, a new open source toolset from the SEI that simulates data drift in ML models and identifies the proper metrics to detect drift in production environments. Portend can also produce alerts if it detects drift, enabling users to take corrective action and enhance ML assurance. This post explains the toolset architecture and illustrates an example use case.
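To make the idea of drift detection concrete, the sketch below compares a training-time feature distribution against production data using a two-sample Kolmogorov-Smirnov (KS) statistic and raises an alert when the statistic exceeds a threshold. This is an illustrative example only, not Portend's actual API; the function names and the 0.2 threshold are assumptions chosen for demonstration.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance both pointers past the current value, then compare CDFs.
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d


def check_for_drift(training_feature, production_feature, threshold=0.2):
    """Return (drift_detected, statistic) for one feature column.
    The threshold is a hypothetical value; in practice it would be
    tuned on simulated drift scenarios."""
    d = ks_statistic(training_feature, production_feature)
    return d > threshold, d


# Example: production values shifted upward relative to training.
train = [x / 10 for x in range(100)]
prod = [x / 10 + 5.0 for x in range(100)]
drifted, stat = check_for_drift(train, prod)
if drifted:
    print(f"ALERT: drift detected (KS statistic = {stat:.2f})")
```

A production monitor would run a check like this per feature on a sliding window of recent inputs, which is the kind of metric selection and alerting that Portend is designed to support.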
Publisher Statement
NO WARRANTY. THIS CARNEGIE MELLON UNIVERSITY AND SOFTWARE ENGINEERING INSTITUTE MATERIAL IS FURNISHED ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.
[DISTRIBUTION STATEMENT A] This material has been approved for public release and unlimited distribution. Please see Copyright notice for non-US Government use and distribution.