Carnegie Mellon University

Time Series Analysis in Drug Product Development and Statistical Forecasting

thesis
posted on 2022-11-16, 22:51 authored by Brad Johnson

The statistical discipline of time series analysis excels at modeling and forecasting the dynamics of complex systems whose successive process measurements exhibit non-random correlations that cannot be captured with traditional, continuous-time regression. Typical time series analysis applications are found in business operations, econometrics, and the natural and social sciences. Statistical time series models do not exist in a vacuum: they may be hybridized with a physical process’s first-principles model or combined with several alternative models in an ensemble. This thesis combines concepts from parameter estimation, dynamic modeling, and time series analysis to address open research questions in the fields of pharmaceutical powder flow simulation and time series forecasting.

Screw feeders, the first unit operation in continuous manufacturing of drug product (CMDP) processes, influence the mass flow rate and concentrations of pharmaceutical powders downstream. A well-known bottleneck in CMDP process development is the dependence on experimentally fit flowsheet models. Unlike the more computationally expensive discrete element method (DEM) models, these flowsheet models cannot simulate realistic, highly variable flow rates, and each experimental fit has limited generalizability beyond its configuration. Being able to predict and simulate realistic mass flow dynamics enables engineers to begin design upon characterization and to inform experimentation, ultimately speeding up a CMDP process’s time-to-market. To this end, we developed a parameter estimation algorithm and a hybrid deterministic-stochastic flowsheet model that simulates a realistic flow rate without a high computational or experimental burden. Analyzing the deterministic model errors across many volumetrically fed experiments revealed autocorrelation between samples and a consistent leptokurtic, heavy-tailed distribution. Assuming homoscedasticity, the stochastic behavior of feeder powder flow can be modeled with a three-parameter autoregressive moving average (ARMA) model with mean-zero Laplace noise. The set of estimated hybrid model parameters can serve as response data for developing a predictive dynamic feeder model. Additionally, a predictive dynamic model for feeder flow rate was found using best subset selection (BSS) to learn a weighted sum of nonlinear transformations of powder properties, feeder parameters, and operating state.
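
The hybrid deterministic-stochastic idea can be illustrated with a minimal Python sketch. This is not the thesis's implementation. It assumes that "three-parameter" refers to an ARMA(1,1) process with an autoregressive coefficient `phi`, a moving-average coefficient `theta`, and a Laplace scale `scale`; the abstract does not state the parameterization. The function name `simulate_hybrid_flow`, the constant deterministic trajectory, and all numeric values are hypothetical.

```python
import numpy as np


def simulate_hybrid_flow(deterministic, phi, theta, scale, seed=0):
    """Add ARMA(1,1) residuals with Laplace innovations to a deterministic
    flow-rate trajectory (assumed parameterization, for illustration only)."""
    rng = np.random.default_rng(seed)
    deterministic = np.asarray(deterministic, dtype=float)
    n = deterministic.shape[0]

    # Mean-zero Laplace innovations with constant scale (homoscedastic).
    eps = rng.laplace(loc=0.0, scale=scale, size=n)

    # ARMA(1,1) recursion: r[t] = phi * r[t-1] + eps[t] + theta * eps[t-1],
    # started from zero for the terms before the first sample.
    resid = np.zeros(n)
    for t in range(n):
        prev_resid = resid[t - 1] if t > 0 else 0.0
        prev_eps = eps[t - 1] if t > 0 else 0.0
        resid[t] = phi * prev_resid + eps[t] + theta * prev_eps

    return deterministic + resid


# Hypothetical usage: a constant deterministic flow rate of 1.0 (arbitrary units).
flow = simulate_hybrid_flow(np.full(5000, 1.0), phi=0.7, theta=0.2, scale=0.05)
```

With a positive `phi`, consecutive samples of the simulated residual are positively correlated, which mirrors the autocorrelation between samples reported above.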

Forecasts of key business metrics are used to make important decisions every day. To improve prediction accuracy, several different forecasts are aggregated. However, there is no consensus on the best way to combine these forecasts. This is further complicated by human experts adjusting the predictions or the combination weights manually. We proposed an algorithm to select which forecasts to combine and estimate their weights by solving the BSS mixed-integer optimization problem. This approach allows for the novel encoding of expert judgment as constraints, improving user confidence in the resulting forecasts. In practice, these methods increase modeling flexibility while still identifying highly accurate combinations of forecasters.
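
The selection-and-weighting step can be sketched in Python as follows. This is only an illustration. It replaces the mixed-integer optimization formulation with exhaustive enumeration of subsets, which is practical only for a handful of forecasters, and it fits unconstrained least-squares weights. The function name `best_subset_combination`, the cardinality parameter `k`, and the `must_include` argument (one simple way an expert constraint might be encoded) are assumptions, not the thesis's formulation.

```python
from itertools import combinations

import numpy as np


def best_subset_combination(F, y, k, must_include=()):
    """Pick exactly k forecasters (columns of F) and least-squares weights
    that minimize in-sample squared error against the actuals y.

    must_include is a hypothetical expert constraint: column indices that
    every candidate subset has to contain.
    """
    F = np.asarray(F, dtype=float)
    y = np.asarray(y, dtype=float)
    required = set(must_include)

    best_sse, best_subset, best_weights = np.inf, None, None
    for subset in combinations(range(F.shape[1]), k):
        if not required <= set(subset):
            continue  # violates the expert constraint
        X = F[:, list(subset)]
        weights, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ weights) ** 2))
        if sse < best_sse:
            best_sse, best_subset, best_weights = sse, subset, weights

    return best_sse, best_subset, best_weights


# Hypothetical data: four forecasters, with the actuals built from two of them.
rng = np.random.default_rng(1)
F = rng.normal(size=(50, 4))
y = 0.6 * F[:, 0] + 0.4 * F[:, 2]
sse, subset, weights = best_subset_combination(F, y, k=2)
```

Passing `must_include=(1,)` forces forecaster 1 into the combination, at the cost of a larger in-sample error in this synthetic example.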


History

Date

2022-02-23

Degree Type

  • Dissertation

Department

  • Chemical Engineering

Degree Name

  • Doctor of Philosophy (PhD)

Advisor(s)

Nikolaos V. Sahinidis