The development of effective treatments for chronic diseases, such as heart disease, diabetes, cancer, and Alzheimer’s, is a key global healthcare challenge. These conditions are primary drivers of healthcare spending and have a deleterious effect on quality of life and mortality. Fortunately, they are inextricably linked to modifiable health-related behavioral risk factors such as smoking, poor diet, and lack of physical activity. Mobile health technology provides a novel tool for supporting lifestyle modifications via behavioral interventions that can provide personalized guidance to patients, thereby reducing their risk factors and improving health outcomes.
While app-based therapies that rely on self-report data (e.g., ecological momentary assessment, or EMA) are increasingly available, passive sensing from wearables holds the promise of continuously monitoring key physiological and contextual variables, thereby providing long-term insight into patterns of behavior and risk at low burden to participants. We are developing novel biomarkers from wearable sensors to capture physiological and contextual variables, and machine learning methods for modeling longitudinal data. We work with the mDOT and MD2K centers and have a broad set of collaborators in engineering, behavioral science, and statistics, with a focus on smoking, sedentary behavior, and substance abuse. We currently work in three research areas:
- Biomarkers for physiological and contextual risk factors
- Predictions of adverse outcomes from event data
- Modeling and imputing missing measurements in longitudinal data
Biomarkers for physiological and contextual risk factors
We are collaborating with the Inan Lab in developing novel biomarkers for hemodynamic parameters, such as pulmonary capillary wedge pressure, and in developing predictive models for heart failure (HF) clinical status (i.e., compensated vs. decompensated). Decompensation in HF is a key driver of rehospitalization, resulting in increased mortality risk and cost of care. Using an implantable device to measure pulmonary artery pressure and titrate treatment has been shown to reduce the risk of decompensation, but the procedure is invasive, expensive, and unsuitable for some patients. We are exploring ballistocardiography and seismocardiography as a means to estimate heart failure status by measuring and analyzing the small vibrations that propagate through the body due to the mechanical action of the heart. Ballistocardiography (BCG) uses a modified weighing scale, and we have demonstrated classification of HF decompensation with an AUC of 0.78 using BCG data collected in a home environment. Seismocardiography uses an accelerometer placed on the chest and is currently under investigation.
We are also developing biomarkers for contextual variables based on passively collected signals from wearable sensors. We have collaborated with the MD2K center in developing a biomarker for smoking opportunity, which captures the situational factors (such as proximity to locations in which smoking is permitted) that can precipitate relapse during a quit attempt. We have also developed a novel biomarker that uses a wearable camera to quantify exposure to screens, a known risk factor for sedentary behavior. Below we illustrate several examples of attention to TVs and mobile devices inferred from wearable camera video (without eye tracking). A key challenge in contextual sensing is the accurate synchronization of multiple wearable modalities, and in particular wearable cameras, to a common timeline. We have developed an automatic synchronization approach, known as SyncWISE, and have demonstrated its effectiveness in aligning video and accelerometry signals [Yun-Synch].
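To make the synchronization problem concrete, the sketch below estimates a constant time offset between two signals using plain cross-correlation. This is a generic baseline and not the SyncWISE method. It assumes that both modalities have already been reduced to one-dimensional motion-like signals at the same sampling rate; the function name `estimate_shift` and the toy signals are hypothetical.

```python
def estimate_shift(reference, delayed, max_lag):
    """Return the lag (in samples) at which `delayed` best matches `reference`.

    A positive result means `delayed` lags behind `reference` by that many samples.
    """
    def centered(signal):
        mean = sum(signal) / len(signal)
        return [value - mean for value in signal]

    ref = centered(reference)
    dly = centered(delayed)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score, count = 0.0, 0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(dly):
                score += r * dly[j]
                count += 1
        if count == 0:
            continue
        score /= count  # average over the overlap so short overlaps are not penalized
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Toy example: the second signal is the first one delayed by three samples.
motion = [0, 0, 1, 3, 1, 0, 0, 2, 5, 2, 0, 0, 0, 1, 0, 0]
delayed_motion = [0, 0, 0] + motion[:-3]
shift = estimate_shift(motion, delayed_motion, max_lag=5)
print(shift)
```

A single global offset is the simplest possible model. Real wearable recordings can also drift over time, which a single cross-correlation peak does not capture.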
Predictions of adverse outcomes from event data
Many common temporal data sources in clinical and mobile health applications consist of event data, in which measurements occur at arbitrary, irregularly distributed time points. Visits to the clinic for assessment, which are common in treating conditions such as glaucoma or autism, rarely occur on a fixed schedule. Likewise, in mobile health applications it is common to perform Ecological Momentary Assessments (EMA) at randomly chosen times throughout the day. We have developed two approaches to modeling such data:
- Latent state models based on continuous-time hidden Markov models (CT-HMMs)
- Deep learning models based on transformers
A CT-HMM is a hidden Markov model (HMM) in which both the transitions between hidden states and the arrival of observations can occur at arbitrary (continuous) times, rather than on a fixed clock. We have developed efficient parameter learning methods for CT-HMMs and have demonstrated their utility in modeling the progression of glaucoma and Alzheimer’s disease (illustrated below). We have also developed a spatiotemporal CT-HMM approach to modeling glaucoma progression, which can describe the temporal history of 2D visual field and retinal thickness maps and accurately predict future patient status as a tool for treatment management. Finally, we have developed a predictive model for relapse in substance abuse by combining a CT-HMM of latent behavioral states with a survival model for risk prediction.
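The key difference from a discrete-time HMM can be sketched in a few lines. In a continuous-time Markov chain the hidden-state dynamics are described by a rate matrix Q, and the transition probabilities over an elapsed time t are given by the matrix exponential P(t) = exp(Qt), so each irregular gap between visits gets its own transition matrix. The code below is a minimal illustration of that relationship only, not the learning method from the cited work. The three-state rate matrix and the helper names are hypothetical.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, terms=20, squarings=10):
    """Approximate exp(m) with a truncated Taylor series plus scaling and squaring."""
    n = len(m)
    scale = 2 ** squarings
    scaled = [[value / scale for value in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # After this step, term holds scaled**k / k!
        term = [[value / k for value in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)  # undo the scaling: exp(m) = exp(m / 2**s) ** (2**s)
    return result


def transition_probs(rate_matrix, elapsed):
    """Return P(elapsed) = exp(Q * elapsed) for a rate matrix Q."""
    return mat_exp([[q * elapsed for q in row] for row in rate_matrix])


# Hypothetical progressive model with an absorbing final state.
# Off-diagonal entries are transition rates; each row sums to zero.
Q = [[-0.5, 0.5, 0.0],
     [0.0, -0.2, 0.2],
     [0.0, 0.0, 0.0]]

P_short_gap = transition_probs(Q, 0.5)  # visits half a time unit apart
P_long_gap = transition_probs(Q, 2.0)   # visits two time units apart
for row in P_long_gap:
    print([round(p, 4) for p in row])
```

The hand-rolled matrix exponential keeps the sketch dependency-free. In practice a library routine would be the more robust choice.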
We are currently developing transformer-based models for predicting EMA noncompliance. It is extremely common and useful to gather self-reported data from participants, for example as EMAs in behavioral studies and as electronic patient-reported outcomes (ePRO) in clinical trials. Noncompliance, in which participants fail to provide data at select time points, is a significant and well-recognized problem. We are developing a deep learning architecture that analyzes a sequence of EMA responses, along with other contextual cues, and predicts the likelihood that a participant will be noncompliant with the next EMA prompt. Our investigation leverages a novel dataset in which participant availability is automatically assessed from sensor data prior to triggering EMA prompts, making it possible to distinguish noncompliance from overall nonresponse. We have shown that it is possible to predict noncompliance with an AUC of 0.76.
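For readers less familiar with the AUC metric quoted above, it can be read as the probability that a randomly chosen positive case (here, a noncompliant prompt) receives a higher predicted score than a randomly chosen negative case. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, not study data, and the function name `roc_auc` is hypothetical.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 1/2)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = noncompliant prompt, 0 = answered prompt.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.4]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of examples, so a rank-based implementation is preferable for large datasets.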
Modeling and imputing missing measurements in longitudinal data
Problems with missing data are endemic to clinical and mobile health applications. We are developing principled strategies for imputing missing observations from temporal data in the context of the hierarchical computations that naturally arise in mHealth biomarker inference. As a first step along these lines, we have addressed the problem of missing observations in panel count data. In many mHealth applications, participants are asked to self-report counts of behaviors of interest (e.g., number of cigarettes smoked since the last self-assessment). This is an example of a more general setting of modeling counting processes for adverse behavioral outcomes. We have recently addressed the problem of modeling the mean function of a counting process from panel count data with missing counts. We developed a simple functional EM algorithm to wrap existing panel count data methods, and extended finite-sample parametric EM theory to the functional setting via a novel use of Gâteaux derivatives. Our results suggest that the standard imputation methods may be underestimating actual counts.
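The flavor of EM for missing counts can be illustrated with a deliberately simplified parametric toy. It is not the functional EM algorithm described above. The sketch assumes a homogeneous Poisson process with a single unknown rate and counts that are missing at random. The E-step fills each missing interval with its expected count under the current rate, and the M-step re-estimates the rate from the completed data. The function name, interval lengths, and counts are all hypothetical.

```python
def em_poisson_rate(intervals, counts, iterations=200):
    """Estimate a constant event rate from interval counts, some of which are None."""
    total_length = sum(intervals)
    observed_sum = sum(c for c in counts if c is not None)
    observed_length = sum(d for d, c in zip(intervals, counts) if c is not None)
    if observed_length == 0:
        raise ValueError("at least one interval must have an observed count")
    rate = 1.0  # arbitrary positive starting value
    for _ in range(iterations):
        # E-step: expected count in each missing interval under the current rate.
        imputed = sum(rate * d for d, c in zip(intervals, counts) if c is None)
        # M-step: rate that maximizes the expected complete-data Poisson log-likelihood.
        rate = (observed_sum + imputed) / total_length
    return rate


# Toy self-report data: interval lengths in hours, None marks a missing count.
intervals = [2.0, 1.0, 3.0, 2.0, 2.0]
counts = [4, None, 5, None, 3]

em_rate = em_poisson_rate(intervals, counts)

# Naive baseline for contrast only: treat every missing count as zero.
zero_fill_rate = sum(c or 0 for c in counts) / sum(intervals)

print(round(em_rate, 3), round(zero_fill_rate, 3))
```

In this toy setting the EM iterations converge to the rate computed from the observed intervals alone, while the zero-fill baseline is pulled downward. The zero-fill baseline is included purely as a contrast and is not necessarily one of the imputation methods evaluated in the cited work.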
- V. B. Aydemir, S. Nagesh, M. M. H. Shandhi, J. Fan, L. Klein, M. Etemadi, J. A. Heller, O. T. Inan, and J. M. Rehg, Classification of Decompensated Heart Failure From Clinical and Home Ballistocardiography. IEEE Transactions on Biomedical Engineering, vol. 67, no. 5, pp. 1303–1313, 2020.
- Chatterjee, S., Moreno, A., Lizotte, S. L., Akther, S., Ertin, E., Fagundes, C. P., ... & Kumar, S. (2020). SmokingOpp: Detecting the Smoking Opportunity Context Using Mobile Sensors. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(1), 1-26.
- Kumar, S., Abowd, G. D., Abraham, W. T., Al’Absi, M., Gayle Beck, J., Chau, D. H., Condie, T., Conroy, D.E., Ertin, E., Estrin, D., Ganesan, D., Lam, C., Marlin, B., Marsh, C.B., Murphy, S.A., Nahum-Shani, I., Patrick, K., Rehg, J.M., Sharmin, M., Shetty, V., Sim, I., Spring, B., Srivastava, M., Wetter, D. W. (2015). Center of excellence for mobile sensor data-to-knowledge (MD2K). Journal of the American Medical Informatics Association, 22(6), 1137–1142.
- Liu, Y. Y., Li, S., Li, F., Song, L., & Rehg, J. M. (2015). Efficient learning of continuous-time hidden Markov models for disease progression. In Advances in Neural Information Processing Systems (pp. 3600-3608).
- Dempsey, W. H., Moreno, A., Scott, C. K., Dennis, M. L., Gustafson, D. H., Murphy, S. A., & Rehg, J. M. (2017). iSurvive: an interpretable, event-time prediction model for mHealth. Proceedings of Machine Learning Research, 70, 970.
- Moreno, A., Wu, Z., Yap, J. R., Lam, C., Wetter, D., Nahum-Shani, I., ... & Rehg, J. M. (2020). A Robust Functional EM Algorithm for Incomplete Panel Count Data. Advances in Neural Information Processing Systems, 33.
- Nagesh, S., Moreno, A., Ishikawa, H., Wollstein, G., Schuman, J. S. (2019). A Spatiotemporal Approach to Predicting Glaucoma Progression Using a CT-HMM. Machine Learning for Healthcare Conference (pp. 140-159).
- Rehg, J. M., Murphy, S. A., & Kumar, S. (2017). Mobile Health: Sensors, Analytic Methods, and Applications. Springer.
- Yun C. Zhang, James M. Rehg. Watching the TV watchers. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 2, 2, Article 88 (July 2018).
- Yun C. Zhang*, Shibo Zhang*, Miao Liu, Elyse Daly, Samuel Battalio, Santosh Kumar, Bonnie Spring, James M. Rehg, and Nabil Alshurafa. 2020. SyncWISE: Window Induced Shift Estimation for Synchronization of Video and Accelerometry from Wearable Sensors. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol. 4, 3, Article 107 (September 2020).