Data Reduction
Laboratories
VTTI data reduction laboratories process the raw data collected in VTTI studies and reduce it into a format that allows specific research questions to be accurately and effectively answered.
Capabilities
Data Services uses two full-time data reduction laboratories that will contain 32 data reduction stations when they are fully staffed. To maintain data security, all data reduction machines are connected only to the data server and maintain no outside connections.
Software
Research data on the Storage Area Network (SAN) can be accessed using a dedicated, high-speed SQL server, and via special application servers using Microsoft SQL Server®, MATLAB®, and SAS®. To facilitate data reduction activities, VTTI has developed proprietary data-viewing software that allows synchronized viewing of driver performance (parametric) data and video/audio streams. This system allows researchers and data reductionists to work directly with large databases while providing ease of use and improved accuracy.
Data Reductionists
Data reductionists undergo a rigorous training and evaluation process before they are approved to reduce data for projects. Additionally, each reductionist's data is periodically spot checked by the data supervisor to ensure quality standards are maintained. For all studies, a series of testing protocols is in place to ensure that data reduction is consistent both between reductionists and for a single reductionist over time. These tests are conducted on a regular basis to ensure that there is no drift in the quality of reduced data.
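The source does not say which statistic the consistency tests use. As one illustration, agreement between two reductionists who code the same clips is often summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a minimal example under that assumption; the function name, the category labels, and the sample codes are all hypothetical.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(labels_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical eyeglance-location codes from two reductionists on the same clips.
reductionist_1 = ["forward", "mirror", "forward", "phone", "forward", "mirror"]
reductionist_2 = ["forward", "mirror", "forward", "forward", "forward", "mirror"]
kappa = cohens_kappa(reductionist_1, reductionist_2)
print(f"kappa = {kappa:.2f}")
```

The same function could compare one reductionist's codes for a fixed reference set at two points in time, which is one way to look for drift.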
Data Reduction Tools
VTTI has developed software tools to significantly reduce the time required to analyze eyeglance and other video-based data. These software tools allow the data reductionists to examine the data and insert additional information such as eyeglance locations and lengths, answers to questions about the driving environment, and specific information about near-crashes or crashes before, during, and after the event. The video is also synched with the kinematic data to allow detailed crash and near-crash analysis.
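The proprietary tools themselves are not described in detail here. As a rough sketch of what synchronizing video with kinematic data can involve, the example below pairs each video frame timestamp with the nearest kinematic sample. It assumes both streams share a common clock and that sample times are sorted; the function names, sampling rates, and values are hypothetical.

```python
from bisect import bisect_left

def nearest_sample_index(sample_times, t):
    """Index of the entry in sorted sample_times that is closest to time t."""
    if not sample_times:
        raise ValueError("sample_times must not be empty")
    i = bisect_left(sample_times, t)
    if i == 0:
        return 0
    if i == len(sample_times):
        return len(sample_times) - 1
    before, after = sample_times[i - 1], sample_times[i]
    # Pick the later sample only if it is strictly closer; ties go to the earlier one.
    return i if (after - t) < (t - before) else i - 1

def align_frames(frame_times, sample_times, samples):
    """Pair each video frame time with the nearest kinematic sample."""
    return [(ft, samples[nearest_sample_index(sample_times, ft)]) for ft in frame_times]

# Hypothetical data: speed samples every 0.1 s, video frames at irregular times.
sample_times = [0.0, 0.1, 0.2, 0.3, 0.4]
speeds_mph = [30.0, 30.5, 31.0, 29.0, 25.0]
frame_times = [0.00, 0.07, 0.13, 0.33, 0.50]
aligned = align_frames(frame_times, sample_times, speeds_mph)
for frame_time, speed in aligned:
    print(f"frame at {frame_time:.2f} s -> {speed} mph")
```

Nearest-sample matching is the simplest choice; interpolating between neighboring samples is an alternative when the kinematic signal changes quickly between samples.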
Data Services Center
The Data Services Center of VTTI is at the forefront of High Performance Computing.
Research is driven by data collection.
VTTI has quickly become an industry leader in naturalistic driving research, and this has created the need for advanced data services.
We work with collected raw sensor data such as speed, distance, and geopositioning, which must then be plotted, coupled, and synchronized with real-time video.
The data in this form is then analyzed for any number of variables, including research participants' driving behaviors and responses. The recently completed 100-Car Study, for instance, produced over 6 Terabytes (TB) of data covering more than 2,000,000 vehicle miles and 42,300 hours of driving. That is only one example of our many completed and ongoing projects.
How We Meet This Challenge:
Data Reduction
The Data Reductionists are tasked with turning raw data into a format from which researchers may glean real answers to specific questions. Data Reductionist training is rigorous, and quality is maintained through data supervisor spot checks and regular consistency checks that verify the accuracy of the processed data over time.
Two full-time data reduction laboratories are dedicated to this effort, housing workstations with a direct connection to the data center. For security, no outside connections are allowed.
The programmers at VTTI have developed proprietary software that allows synchronized viewing of participant metrics alongside streaming video and audio. Additional software tools we have created allow faster analysis of eyeglance data. The reductionists can also insert additional information about eyeglance locations and lengths, about the driving environment, and about near-crashes and crashes before, during, and after the event. The video is synchronized with the kinematic data to allow detailed crash and near-crash analysis.
In addition to our proprietary software, research data can be accessed using a dedicated, high-speed SQL server, and via special application servers using Microsoft SQL Server®, MATLAB®, and SAS®.
Support and Resources
Two groups work cooperatively to operate the Data Services Center: the Center for Technology Development's Data Services Group and VTTI's Information Technology Operations group.
The Data Services Group handles the uploading of collected data, the management and large-scale storage of data, data tracking, and data quality control and assurance. The programmers of our proprietary software work under this group's auspices and also maintain that software. Another set of programmers sees to database building, optimization, and maintenance.
IT Operations determines specifications, acquires hardware and software, and maintains the hardware, repairing and updating it as needed. Other responsibilities include network security, access control management, and data backup and archiving. In addition, the group configures and maintains data reductionist workstations and provides user support to the Data Reductionists.
It takes big teeth to crunch big numbers.
The Data Services back end is housed in two locations, one within VTTI itself and the other at the Virginia Tech Data Center. Between the two sites, it features:
- 40+ high-performance servers
- Over 100 TB of redundant high-speed storage with a full Petabyte available for video
- A large-capacity backup system that includes a tape library capable of handling 14.4 Gigabytes of data per minute.
- High-speed 10-Gigabit Ethernet for all connections. VTTI pioneered the implementation of this technology to every portal.
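To put the tape library figure in perspective, the back-of-the-envelope calculation below converts 14.4 Gigabytes per minute into a per-second rate and estimates how long one full copy of 100 TB would take. It assumes decimal (SI) units and a sustained rate with no compression or overhead, neither of which is stated in the list above, so treat the result as a rough bound only.

```python
GB_PER_MINUTE = 14.4   # tape library rate from the list above
STORAGE_TB = 100       # redundant high-speed storage from the list above
GB_PER_TB = 1000       # assumption: decimal (SI) units
MB_PER_GB = 1000       # assumption: decimal (SI) units

# Convert the per-minute rate to megabytes per second.
mb_per_second = GB_PER_MINUTE * MB_PER_GB / 60

# Time for one full pass over the storage at a sustained rate.
minutes_for_full_copy = STORAGE_TB * GB_PER_TB / GB_PER_MINUTE
days_for_full_copy = minutes_for_full_copy / (60 * 24)

print(f"{mb_per_second:.0f} MB/s, about {days_for_full_copy:.1f} days for {STORAGE_TB} TB")
```

Under these assumptions the rate works out to roughly 240 MB/s, and a single full pass over 100 TB would take a little under five days.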
The VTTI facility includes:
- Emergency power provided by an onsite diesel generator supplying backup power for the data center, emergency lighting and the telecommunications closet.
- An external 10-Gigabit wide area network connection to High Performance Computing networks such as National LambdaRail and Internet2.
- A dedicated climate control system with a backup contingency system.
- Remote monitoring alarm system to indicate fire, smoke, intrusion, power outage, climate control failure, hardware failure, water presence, high temperature, and other situations.
- Elevated flooring system.
- High physical security with steel reinforced structural walls.
- Limited personnel accessibility.
The Virginia Tech Data Center is located within the 51,000-square-foot Andrews Information Systems Building (AISB) in the Virginia Tech Corporate Research Center. This building is a secure facility with emergency power, an electronic access control system, surveillance cameras, and, after normal business hours, security guards. The Data Center is protected against fire by a Halon gaseous automatic fire suppression system. (Information courtesy of Virginia Tech Advanced Research Computing)
In addition, VTTI boasts:
- Two 600-square-foot secure data reduction laboratories housing high-end workstations.
- 250+ Dell-branded business-class desktops, laptops, and laboratory workstations.
The Future
We will be putting a Teraflop of computing power toward facial recognition. Facial recognition will allow us to identify individual drivers within a study and link them with the real-time data collected during their activities, ultimately helping to refine analysis.
Other advancements are on the horizon.
