
Berkeley Sensor Database: A Summary

Posted 2011-11-16 20:53 by shy.ang

http://sensor.berkeley.edu/

Berkeley Sensor Database

Based on the Observations Data Model (ODM), MySQL, the Apache Web Server, and Perl. Its main components are:

  • Modified ODM Schema
  • Data Loader
  • Web Interface
  • Administrative and Reporting functions

 

Data Loader

The Data Loader runs hourly to check for new measurements and populate the Sensor Database. It performs a basic "sanity check" on incoming values, and issues email alerts to the datastream's contact if invalid conditions are found.

It performs the following functions:

  1. Checks for new timestamps arriving from the field (compares new timestamps to the latest one in the DB)
  2. Checks for new devices and logger configuration changes in the logger file
  3. Gets metadata from the database for each new measurement (i.e., variable name and units, station name, contact, etc.)
  4. Performs sanity checks: is the new value within the device's specs range? has the device stopped reporting?
  5. Flags any values that fail the sanity check, prepares alerts
  6. Converts raw data to geophysical units as required
  7. Assigns a Data Quality Level (raw, converted, etc.) for each datastream
  8. Inserts new records into the MySQL database
  9. Updates statistics
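The sanity checks in steps 4-5 can be sketched as follows. This is a minimal illustration, not the project's Perl implementation; the spec ranges, flag names, and gap threshold are assumptions (in the real system they come from the metadata tables):

```python
from datetime import datetime, timedelta

# Hypothetical spec ranges per variable; the real system reads device
# specs from the metadata tables in the modified ODM schema.
SPEC_RANGES = {"air_temp_c": (-40.0, 60.0), "soil_moisture_pct": (0.0, 100.0)}

def sanity_check(variable, value, last_seen, now, max_gap=timedelta(hours=2)):
    """Return the list of flags raised for one incoming measurement."""
    flags = []
    lo, hi = SPEC_RANGES[variable]
    if not (lo <= value <= hi):
        flags.append("out_of_range")       # step 4: outside the device's specs
    if now - last_seen > max_gap:
        flags.append("stopped_reporting")  # step 4: gap since last timestamp
    return flags                           # step 5: non-empty -> flag + alert
```

A non-empty result would be stored with the record and trigger the email alert to the datastream's contact.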

User Interface

       The Web Interface interacts with the Sensor Database through the Apache Web Server, MySQL, and Perl modules. Users can:

  • Query all data, or only specific station(s) or variable(s)
  • View query results as a graph, or optionally in a table
  • Download query results, or in bulk by station
  • View all metadata (stations, methods, sites, etc.)
  • View data statistics
  • Report "incidents" (such as damage to a device) that affect data quality
  • Add and edit all metadata
  • Read documentation about the database

Controlling Access to the Data

       Access is web-based, via login.

       The "people" table assigns each user an access level (1-4); each station has an access code (1-4).

       SQL queries that incorporate the user's access code are dynamically generated.
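A dynamically generated, access-restricted query might look like the sketch below. The table and column names (`Stations`, `AccessCode`) are illustrative placeholders, not the actual modified-ODM schema:

```python
def station_query(user_access_level):
    """Build a parameterized station query restricted by access level.

    Assumption: a user at level N may see only stations whose access
    code is <= N.
    """
    sql = ("SELECT StationID, StationName FROM Stations "
           "WHERE AccessCode <= %s")
    return sql, (user_access_level,)
```

Using a bound parameter rather than string interpolation keeps the dynamically built SQL safe from injection.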

Data Integrity

The requirements for our system are: 
(1) All versions of data (raw, converted, derived, and corrected) must be stored. Converted, corrected and derived levels of data should be annotated. 
(2) Users must be able to flag questionable data. 
(3) The system must do a basic "sanity check" on incoming data. 
(4) Any flags or comments on the data should be displayed with query results.

 

Data Quality Levels

       QualityControlLevelCode is assigned to each incoming measurement by the Data Loader.

Data Qualifiers

       QualifierID is assigned to each incoming measurement by the Data Loader.
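Attaching the two codes to each incoming measurement could be sketched as below; the numeric level scheme (0 = raw, 1 = converted) is a placeholder, not the project's actual enumeration:

```python
def annotate(measurement, converted, qualifier_id=None):
    """Attach the two codes the Data Loader assigns to each measurement."""
    # Placeholder scheme: 0 = raw, 1 = converted to geophysical units.
    measurement["QualityControlLevelCode"] = 1 if converted else 0
    # QualifierID references a row describing the flag; None means clean.
    measurement["QualifierID"] = qualifier_id
    return measurement
```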

 

Administrative and Reporting Modules

These modules monitor the workflow, update statistics, refresh data caches, and issue reports.

Workflow Monitor

  • datalogger polling: A server in the field polls dataloggers hourly to collect new data, and retrieves it over a wireless network.
  • data transfer from the field: Once per hour, the field server transmits new data to a staging area at UC Berkeley.
  • data loading: Once per hour, the Data Loader checks the staging area for new data and loads it into the database as described above.
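Since each stage above runs hourly, a monitor can detect a stalled stage simply by comparing last-run timestamps against a tolerance. A minimal sketch, with the two-hour tolerance as an assumption:

```python
from datetime import datetime, timedelta

def stalled_stages(last_runs, now, max_age=timedelta(hours=2)):
    """Return the names of hourly workflow stages that look stalled."""
    return [stage for stage, ran in last_runs.items() if now - ran > max_age]
```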

Updating Statistics

Refreshing Data Caches

Bulk Data Download:

       As the database grows, querying large spans of data takes a long time.

       "Bulk Data Download": all the data for a specific station can be downloaded as a Zip file.
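Preparing such a per-station Zip file can be sketched as below; the station name, datastream layout, and one-CSV-per-datastream packaging are illustrative assumptions:

```python
import io
import zipfile

def bulk_zip(station, datastreams):
    """Pack one CSV per datastream into an in-memory Zip archive.

    `datastreams` maps datastream name -> list of (timestamp, value)
    rows; this layout is an assumption for illustration.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, rows in datastreams.items():
            csv = "\n".join(f"{t},{v}" for t, v in rows)
            zf.writestr(f"{station}/{name}.csv", csv)
    return buf.getvalue()
```

Regenerating these archives periodically (rather than on each request) is what makes bulk access cheap compared to a large live query.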

Caching Metadata:

       Minimize the number of queries needed to satisfy a user request.

       Cache metadata
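The metadata cache idea reduces to: query the database once per key, then serve repeat requests from memory. A minimal in-process sketch (the `fetch` callback stands in for the real SQL lookup):

```python
_cache = {}

def get_metadata(key, fetch):
    """Return metadata for `key`, calling `fetch` (a DB query) only on a miss."""
    if key not in _cache:
        _cache[key] = fetch(key)  # one SQL round trip instead of many
    return _cache[key]
```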