Automated anomaly detection processor for biologic terrorism early detection, Hampton, Virginia
Introduction: Disease surveillance databases can reach terabytes in size, which makes rapid, meaningful analysis of the data impractical and expensive. Robust, automated, nontemplate-based real-time processing techniques that can monitor large-scale disease, health-care, and environmental tracking and surveillance data sets are needed to discriminate between naturally occurring events and emergent diseases or biologic terrorist attacks.
Objectives: This study evaluated the ability of an automated anomaly detection processor to detect a simulated anthrax attack during influenza season.
Methods: This report describes the application of data-mining techniques in developing an Automated Anomaly Detection Processor (AADP), which uses the Self-Organizing Map clustering algorithm in conjunction with a Gaussian Mixture Model and a Bayesian Analyzer probabilistic model to detect anomalous occurrences in health data sets.
The test case for the model is a data set from the BioWar Model (Carnegie Mellon University) that is based on real-life census data, medical records, and social and behavioral patterns in Hampton, Virginia. The data files include sales from 16 pharmacies in 25 product categories, absenteeism records from 34 schools and 982 work sites, and medical insurance records of the residents. The BioWar data are unique because they contain a simulated biologic weapons attack and human response against a background of naturally occurring illness. In addition, two data files contain time series of nonquantitative observables (e.g., International Classification of Diseases, Ninth Revision codes).
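The probabilistic stage described in Methods can be illustrated with a minimal sketch. It is not the AADP implementation. It assumes that a one-dimensional Gaussian mixture has already been fitted to baseline data (the Self-Organizing Map stage is omitted), and it models the anomalous hypothesis as a uniform density over a plausible value range. The mixture parameters, the prior, the value range, and all function names are hypothetical choices made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, components):
    """Density of a 1-D Gaussian mixture; components are (weight, mean, std)."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in components)


def anomaly_posterior(x, components, prior_anomaly, value_range):
    """Posterior probability that x is anomalous, by Bayes' rule.

    Assumption for this sketch: the anomalous hypothesis is a uniform
    density over value_range; the normal hypothesis is the mixture.
    """
    low, high = value_range
    p_anom = 1.0 / (high - low) if low <= x <= high else 0.0
    p_norm = mixture_pdf(x, components)
    numerator = prior_anomaly * p_anom
    denominator = numerator + (1.0 - prior_anomaly) * p_norm
    return numerator / denominator if denominator > 0.0 else 0.0


# Hypothetical baseline for daily unit sales of one product category:
# a weekday mode near 100 units and a weekend mode near 60 units.
baseline = [(0.7, 100.0, 10.0), (0.3, 60.0, 8.0)]

typical = anomaly_posterior(102.0, baseline, prior_anomaly=0.01,
                            value_range=(0.0, 400.0))
surge = anomaly_posterior(180.0, baseline, prior_anomaly=0.01,
                          value_range=(0.0, 400.0))
```

Under these assumed parameters, a value near the weekday mode receives a posterior close to 0, whereas a value far above both modes receives a posterior close to 1. A real detector would work on multivariate data and would estimate the mixture from the baseline period.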
Results: AADP identified a simulated biologic terrorism attack occurring during the influenza season. First detection of the anthrax outbreak occurred approximately 4.7 days after the attack. For each pharmacy deemed anomalous, the AADP yielded a drill-down table identifying the relative contributions of the variables causing the anomaly. Taken together, the pharmacy AADPs flagged an excessive number of pharmacies as anomalous simultaneously after the simulated attack began. The first pharmacies to turn anomalous were those adjacent to the attack site. Drill-down of the anomalies indicated shared patterns of sales of several categories of pharmaceutical products. These results indicate a systematic cause rather than a chance coincidence of anomalies. In addition, anomalous periods of extended absenteeism were detected soon after the attack.
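Two ideas in the Results can be sketched in a few lines. The abstract does not specify how the drill-down table is computed, so the first function is an assumed approach: each variable's share of the total squared standardized deviation from baseline. The second function computes a binomial tail probability to show why many pharmacies turning anomalous on the same day is unlikely to arise by chance. It assumes the pharmacies flag independently, and the 5% daily false-alarm rate and the product category names are hypothetical.

```python
import math


def drill_down(observed, baseline_mean, baseline_std):
    """Share of the total squared z-score attributable to each variable,
    ordered from largest to smallest contribution."""
    squared = {
        name: ((observed[name] - baseline_mean[name]) / baseline_std[name]) ** 2
        for name in observed
    }
    total = sum(squared.values())
    if total == 0.0:
        return {name: 0.0 for name in observed}
    ranked = sorted(squared.items(), key=lambda item: item[1], reverse=True)
    return {name: value / total for name, value in ranked}


def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance that k or more of n
    independent detectors flag an anomaly on the same day."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
               for i in range(k, n + 1))


# Hypothetical product categories and values for one pharmacy on one day.
observed = {"cough_cold": 160.0, "analgesics": 130.0, "vitamins": 52.0}
mean = {"cough_cold": 100.0, "analgesics": 100.0, "vitamins": 50.0}
std = {"cough_cold": 10.0, "analgesics": 10.0, "vitamins": 5.0}
shares = drill_down(observed, mean, std)

# Chance that 6 or more of the 16 pharmacies are flagged together if each
# flags independently with an assumed 5% daily false-alarm rate.
chance = prob_at_least(6, 16, 0.05)
```

With these assumed inputs, the hypothetical cough and cold category dominates the drill-down, and the probability of six or more simultaneous false alarms falls well below 1 in 1,000. That is the kind of evidence that points to a systematic cause.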
Conclusion: Development of AADP for biosurveillance adds a complementary method to extant surveillance systems and can improve real-time alerting so that assets can be directed toward further epidemiologic investigation and early intervention. These results are prompting further querying of the anomalous events and the inclusion of additional data streams to improve early warning and response.
* This work is supported by the U.S. Army Medical Research and Materiel Command under Contract No. DAMD17-03-C-0061. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the U.S. Department of Defense.
D. Michael Thomas, (1) S. Arouh, (1) K. Carley, (2) J. Kraiman, (1) J. Davis (1)
(1) Dynamics Technology, Inc., Arlington, Virginia; (2) Carnegie Mellon University, Pittsburgh, Pennsylvania
Corresponding author: D. Michael Thomas, Dynamics Technology, Inc., 1555 Wilson Blvd, Ste 703, Arlington, VA 22209. Telephone: 703-841-0990; Fax: 703-841-8385; E-mail: firstname.lastname@example.org.
Disclosure of relationship: The contributors of this report have disclosed that they have no financial interest, relationship, affiliation, or other association with any organization that might represent a conflict of interest. In addition, this report does not contain any discussion of unlabeled use of commercial products or products for investigational use.