Work Package 3

Creation of a common data structure and data access

The EuroFAANG common data access policy is an important counterpart to the infrastructure access policy developed in WP2. It will define the data handling expectations, data access requirements, and conformance with data standards and FAIR (findable, accessible, interoperable and reusable) principles for all users of the EuroFAANG infrastructure via four main objectives:

  1. Define data management, policy and access principles for EuroFAANG in consultation with WP2.
  2. Define, in collaboration with WP4-6, the requirements for metadata standards and ontologies to support emerging EuroFAANG technologies and infrastructures.
  3. Map and identify gaps in existing data structure and data infrastructure, and then define and prototype the required data structures and information systems.
  4. Develop a shared strategy for management, sharing, and standardisation of animal agriculture data with Elixir.

The main goal of Task 1 is to define data management, policy and access principles for EuroFAANG.

A Data Management Plan (DMP) will be developed and published for EuroFAANG in accordance with Horizon guidelines. The DMP will cover all data generated as part of the concept development stage of EuroFAANG, and will then be updated through subsequent project infrastructure stages. The DMP will document how generated data will meet the FAIR principles, how it will be made accessible for verification and community re-use, and plans for long-term preservation in the International Nucleotide Sequence Database Collaboration (INSDC) public archives and the EuroFAANG Data Portal. In addition to the DMP, a data policy and set of access principles will be developed for the future users of the EuroFAANG infrastructure. This will be a key counterpart to the Infrastructure Access Policy developed by WP2. It will define the data handling expectations and data access requirements for all users of the EuroFAANG infrastructure, with a focus on open science, FAIR data principles and existing European policies. This policy will cover the interoperability of the different infrastructure elements, connection to the Data Coordination Centre at EMBL, transfer and curation of data in the public archives and presentation of data through FAANG data portals and related services. The Data Policy and Access Principles will be published alongside the formalised Access Policy for the infrastructure.

The main goal of Task 2 is to develop requirements for metadata standards and ontologies to support emerging EuroFAANG technologies and meta research infrastructures.

The EuroFAANG infrastructures being established require the development of rich metadata standards, in line with FAIR and Elixir principles, and the provision of efficient data curation and information brokering platforms to ensure long-term preservation of generated data in the public INSDC archives and FAANG data portals. The emerging standards will be developed in close collaboration with the community, through focussed workshops and consultation with key scientists and industry representatives. The requirements for FAIR metadata standards will be defined to support curation and biobanking of in vitro systems, genome editing, phenotyping and genomic technologies within the existing EuroFAANG metadata standards framework. Metadata standards will be aligned with Elixir principles and with key existing European groups such as the European Forum of Farm Animal Breeders (EFFAB) and the European Genebank Network for Animal Genetic Resources (EUGENA).

Finalised metadata standards will be published on the EuroFAANG website, the EuroFAANG Data Portal and the key curated Elixir resource FAIRsharing, which already holds existing FAANG standards. The FAIRsharing resource highlights the current gaps in animal agriculture, with most existing sets being largely plant-orientated. Ontologies are a crucial component of effective metadata recording. Ontology requirements will be developed alongside the metadata standards for the emerging infrastructures, and EuroFAANG will coordinate development with initiatives such as AgBioData and the Elixir interoperability Ontology Lookup Service to fill the current gap in farmed animal metadata management. As part of the Task 3 gap analysis, a plan will be formed for the extension of the curation and data brokering platform for use by EuroFAANG infrastructures and users.

The main goal of Task 3 is to map and identify gaps in existing data structure and data infrastructure to support the developing EuroFAANG meta infrastructures.

Developing coordinated, secure and high-quality data management, structure, interoperability and integration between the EuroFAANG infrastructures being established and the EuroFAANG Data Coordination Centre (DCC) is key to the success of EuroFAANG. A first phase will identify the gaps in existing FAANG and EuroFAANG data structure and infrastructure that must be filled to support the developing EuroFAANG meta infrastructures (WP4-6), access requirements (WP2) and interoperability with other existing and developing infrastructures (WP7). For example, for WP4, there is a key infrastructure gap in the linkage from genetic data resources and archives to biorepositories of material. Giving researchers and stakeholders direct access from a data record to the relevant biomaterial would significantly facilitate and accelerate farmed animal research.

The second phase will include the requirements design and technical preparatory work required for the integration and e-infrastructure of the FAANG data platform to support new and emerging technologies and infrastructures for EuroFAANG. This will include development of data standards (Task 2) and validation services to ensure that data generated by infrastructures is FAIR, validated and flows into the EuroFAANG Data Portal. Meeting the requirements of the infrastructures being established will require connections and interoperability to be established with Elixir, EUGENA and EOSC. The close association of the EuroFAANG DCC with the INSDC public archives is critical for long-term sustainability and availability of generated data, and the designed information systems will support and enforce the data policy and access principles defined in Task 1.
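As a rough illustration of what a validation service of this kind might check, the Python sketch below tests a sample metadata record against a small rule set. The field names, the allowed values, the rule set and the `validate_record` function are all hypothetical and are not taken from the EuroFAANG or FAANG metadata standards; a real service would be driven by the published standards and ontologies.

```python
# Hypothetical rule set: field name -> (mandatory?, allowed values or None for free text).
# These names and values are illustrative only, not the actual FAANG metadata standard.
RULES = {
    "sample_name": (True, None),
    "organism": (True, None),
    "material": (True, {"organism", "specimen from organism", "cell line"}),
    "biobank_id": (False, None),
}


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems = []
    for name, (mandatory, allowed) in RULES.items():
        value = record.get(name)
        if value is None or value == "":
            # Only mandatory fields are reported when absent.
            if mandatory:
                problems.append(f"missing mandatory field: {name}")
            continue
        # Controlled-vocabulary fields must use one of the allowed values.
        if allowed is not None and value not in allowed:
            problems.append(f"invalid value for {name}: {value!r}")
    # Flag fields that the rule set does not know about.
    for name in record:
        if name not in RULES:
            problems.append(f"unknown field: {name}")
    return problems


if __name__ == "__main__":
    record = {"sample_name": "S1", "organism": "Sus scrofa", "material": "tissue"}
    for problem in validate_record(record):
        print(problem)
```

Returning a list of problems, rather than stopping at the first failure, lets a submitter correct a whole record in one pass before it is brokered to the archives.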

The synergistic association with Elixir will provide excellent opportunities for community developer engagement and information system prototyping through access to Elixir hackathons, technical workshops and conferences that will widen engagement to technical knowledge beyond EuroFAANG research areas.

The main goal of Task 4 is the development of a shared strategy for management and standardisation of animal agriculture data with Elixir.

A shared strategy for management and standardisation of animal agricultural data will be developed between Elixir and EuroFAANG. This will clarify synergies and responsibilities for data management, data production, data sharing and data standards. Early in this design phase, a joint workshop will be held on the Wellcome Genome Campus, which is the base for both the European Bioinformatics Institute outstation of EMBL and the Elixir hub headquarters. This will minimise the travel requirements for each infrastructure and, when combined with a hybrid meeting option, will ensure maximal participation from scientists, industry representatives and data systems specialists. The workshop will kickstart the development of a public report that will outline the strategies of EuroFAANG and Elixir for collaboration, interoperability and remit for data standards and information systems supporting agricultural G2P.

A second workshop will take place to continue discussions and will include invited experts from Elixir nodes outside the project partners. The EuroFAANG and Elixir workshops will also explore the establishment of an animal agriculture community within the Elixir framework. This work has important connections to Task 1 in Work Package 7, which explores the formation of an Elixir animal agriculture community and the development of cooperation with other research infrastructures that will interact with both EuroFAANG and Elixir, including EOSC and AQUAEXCEL. This process will also draw on synergies with plant agriculture and phenotyping efforts, both with the existing Elixir Plant Science Community and the EMPHASIS infrastructure within Europe, and with US initiatives such as AG2PI and AgBioData.

 
