The Azimuth Project
Azimuth Data Project (Rev #2, changes)

The Azimuth Data Project is an online project to develop expertise in data sets for the environmental sciences, in data engineering, and in the mathematical foundations of data.

Eventually, we hope to be able to “give back” to the scientific and educational community, by:

  • Enhancing the value of existing datasets through further processing

  • Contributing to the development of open-source data storage, access and analysis platforms

  • Publishing selected datasets of educational interest

Like the Azimuth Code Project, this is an “umbrella” project – a loose framework within which people can initiate specific projects that interest them.

The data project can be symbiotic with the code project: it can provide goals for software to be developed by the code project, and it can supply data to be used as input for software models and simulators.

If you would like to participate, we need you! Leave a comment on this thread at the Azimuth Forum.

Topic Areas

  • Existing scientific datasets

  • Data methodology issues

  • Ontologies

  • Mathematical foundations of data

    • Foundations in relational algebra, universal algebra, category theory, etc.
    • Foundations of empirical data – probability, statistics, stochastic processes
  • Data engineering

    • Data representation technologies
    • Algorithms for data representation and transformation
    • Technology for Big Data
  • Dataset production

  • Data modelling and applications of data within the sciences

  • Data interfaces for simulation systems

Eventually, we may end up exploring the empirical transformation of data by nature, through the use of model simulation programs.

Initial Agenda

  • Conduct survey of important data sets used in the environmental sciences

    • Create wiki pages

    • Write blog article with survey results

  • Conduct survey of which scientists are using which data sets, and for what purposes

  • Solicit information on how the data sets can be improved through further processing

  • Write survey blog article on software platforms for data storage and access, especially as they are applied to big data for environmental science