Carnegie Mellon University
Data-driven Building Metadata Inference
posted on 2016-05-01, 00:00 authored by June Young Park

Advances in information technology have driven the development of building technology. In particular, operators can now monitor and control building operation through large numbers of sensors and actuators installed on elements throughout a building. The resulting stream of building data enables both quantitative and qualitative improvements. However, mapping physical building elements to their representation in the cyber system remains difficult. To address this mapping problem, a text mining methodology was developed last summer as part of a project conducted by the Consortium for Building Energy Innovation. Building data was extracted from Building 661 in Philadelphia, PA, and the ground truth linking each data point to its semantic information was labeled by manual inspection. A Support Vector Machine was then trained to learn the relationship between data point names and semantic information. This algorithm achieved 93% accuracy on unseen Building 661 data points. The techniques and lessons from this project were used to develop a framework for analyzing building data from the Gates Hillman Center (GHC) building in Pittsburgh, PA. The new framework consists of two stages. In the first stage, we initially tried to cluster data points with similar semantic information using hierarchical clustering. However, the effectiveness and accuracy of clustering were not adequate for this framework. A filtering and classification model was therefore developed to identify the semantic information of the data points. Using daily statistical features, this model identifies damper position and supply air duct pressure data points with 90% accuracy. With the semantic information from the first stage, the second stage infers the relationship between Variable Air Volume (VAV) terminal units and Air Handling Units (AHUs).
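As an illustration of what "daily statistical features" could look like, the following is a minimal sketch, not the thesis implementation. It assumes each data point is a list of (timestamp, value) readings; the function name `daily_features`, the choice of mean, standard deviation, minimum, and maximum as features, and the sample readings are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, pstdev


def daily_features(readings):
    """Group (timestamp, value) readings by calendar day and summarize each day.

    The four summary statistics are an assumed feature set, chosen only
    to illustrate the idea of per-day features for one data point.
    """
    by_day = defaultdict(list)
    for ts, value in readings:
        by_day[ts.date()].append(value)

    features = {}
    for day, values in sorted(by_day.items()):
        features[day] = {
            "mean": mean(values),
            "std": pstdev(values),  # population standard deviation
            "min": min(values),
            "max": max(values),
        }
    return features


# Hypothetical readings for one data point (values are made up).
readings = [
    (datetime(2016, 1, 1, 0, 0), 20.0),
    (datetime(2016, 1, 1, 12, 0), 40.0),
    (datetime(2016, 1, 2, 0, 0), 10.0),
    (datetime(2016, 1, 2, 12, 0), 10.0),
]
feats = daily_features(readings)
```

Feature vectors of this kind could then be passed to a filter or classifier; which features and which classifier the thesis actually uses is not stated in this abstract.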
The intuitive thermal and flow relationships between VAVs and AHUs were investigated first, and a clustering method based on statistical features was applied to the VAV discharge temperature data. However, the control strategy of this building obscures this relationship. As an alternative, we compared the similarity between damper position at the VAVs and supply air duct pressure at the AHUs by calculating their cross-correlation. This similarity scoring method achieved 80% accuracy in mapping VAVs to AHUs. The proposed framework guides the user, through a data-driven methodology, in finding desired information such as the VAV-to-AHU relationship in data generated by a large number of heterogeneous sensor networks.
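The similarity scoring idea can be sketched as follows. This is a hedged illustration under stated assumptions, not the method as implemented in the thesis: it assumes equally sampled, equal-length series, uses the peak absolute Pearson correlation over a small range of lags as the cross-correlation score, and assigns each VAV to the highest-scoring AHU. The function names, the `max_lag` parameter, the point names, and the sample values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def peak_cross_correlation(x, y, max_lag=2):
    """Largest absolute correlation between x and y over lags in [-max_lag, max_lag]."""
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[lag:], y[:len(y) - lag]
        else:
            a, b = x[:len(x) + lag], y[-lag:]
        if len(a) >= 2:
            best = max(best, abs(pearson(a, b)))
    return best


def map_vavs_to_ahus(vav_dampers, ahu_pressures, max_lag=2):
    """Assign each VAV to the AHU whose pressure series scores highest against its damper series."""
    mapping = {}
    for vav, damper in vav_dampers.items():
        scores = {
            ahu: peak_cross_correlation(damper, pressure, max_lag)
            for ahu, pressure in ahu_pressures.items()
        }
        mapping[vav] = max(scores, key=scores.get)
    return mapping


# Hypothetical series (made-up values, not GHC data).
ahu_pressures = {
    "AHU-1": [1, 2, 3, 4, 5, 6],
    "AHU-2": [3, 1, 4, 1, 5, 9],
}
vav_dampers = {
    "VAV-101": [10, 20, 30, 40, 50, 60],
    "VAV-102": [30, 10, 40, 10, 50, 90],
}
mapping = map_vavs_to_ahus(vav_dampers, ahu_pressures)
```

Taking the absolute value treats strong negative correlation as evidence of a link too, which is a design assumption here; whether the sign matters depends on how damper position and duct pressure interact under the building's control strategy.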




Degree Type

  • Master's Thesis


  • Architecture

Degree Name

  • Master of Science (MS)


Azizan Aziz, Bertrand Lasternas
