15-16th January 2016

me

open data hacks & science

Open Data

data.gov.uk and the openDEFRA initiative represent important progress

increasing volume of data becoming accessible

- hard to keep up
- successful utilisation rests on increasingly integrative approaches
- skills scattered across sectors

Open Data Hack


- increase awareness of data sources

- low-pressure environment that brings together a diversity of backgrounds

- test ideas, think outside the box

- peer to peer learning, collaborative working

scientific community should participate

opportunity for science to contribute:


Help make sense of data!

Large observational datasets are notoriously difficult to work with


  • statistical inference
  • predictive models
  • specialist knowledge

opportunity to gain:

science increasingly computational


increasing potential to access and interact programmatically with data
    - APIs, data scraping
    - implications for research data management
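
A minimal sketch of what programmatic access can look like, written in Python purely for illustration. It assumes the portal exposes a CKAN-style `package_search` action at the URL below; that endpoint, the helper names `build_search_url` and `dataset_titles`, and the sample response are all assumptions, not something these slides confirm. The parsing step runs on a hand-written sample, so nothing is fetched.

```python
import json
from urllib.parse import urlencode

# Assumption: a CKAN-style action API is available at this address.
BASE_URL = "https://data.gov.uk/api/action/package_search"


def build_search_url(query, rows=5):
    """Build a search URL for a free-text query (hypothetical helper)."""
    return BASE_URL + "?" + urlencode({"q": query, "rows": rows})


def dataset_titles(payload):
    """Pull dataset titles out of a CKAN-style JSON response body."""
    response = json.loads(payload)
    if not response.get("success"):
        return []
    return [item["title"] for item in response["result"]["results"]]


# Hand-written stand-in for a real response, so the sketch runs offline.
SAMPLE = json.dumps(
    {"success": True,
     "result": {"count": 1, "results": [{"title": "Example marine dataset"}]}}
)

if __name__ == "__main__":
    print(build_search_url("marine licences"))
    print(dataset_titles(SAMPLE))
```

Keeping URL construction and response parsing in separate functions means the parsing can be checked against saved responses, which also helps with the traceability concerns raised for research data management.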
    

calls for open, reproducible research
    - documentation important
    - traceability, provenance
    - increasingly becoming important for publication
    

Science could benefit from thinking more broadly about scientific outputs
  - interactive data visualisations 
  - web apps 
  - code
  
  becoming increasingly recognised as important scientific outputs

Increasingly a need for open data science approaches

    - more than statistical data analysis
    - encompassing aspects of 
        + data management 
        + data warehousing
        + reproducibility 
        + data best practices.
        

Advanced computational literacy & inter-disciplinary collaboration

The current system disincentivises this development.

  - scientific data management requires effort and is often not rigorous enough
      + skills are often lacking
  - no reward for researchers who make studies transparent and reproducible
  

Perpetuating current practices will undermine scientific research, make it increasingly undiscoverable, and fail to advance ever more diversified scientific fields.

Cross sector collaboration

Opportunities and challenges of Open Data are interrelated across sectors


  • opportunity to get a better understanding of sector specific:

    • gaps and requirements
    • skill sets and expertise


  • opportunity for broad collaboration to develop diverse outputs: improve the way we access, manage, process, interact with, analyse and communicate data


  • opportunity to build strong local networks

Project directions

1: Respond to feedback from government bodies

MMO

KEY THEMES:

1: Describing the marine environment

- Evidence describing the current and future state of the marine area.
- The MMO receives and processes information from a number of sources
    - statutory bodies, other government organisations, scientific advisors, industry,
    trade bodies and the wider public
- Further information is required, and existing information can be improved
  • Data source
  • Mapping
  • Data improvement

MMO

KEY THEMES:

2: Interactions in the marine area

to facilitate sustainable development, marine management requires:

    - a thorough understanding of the overall effect of the interaction of human activities on the
    state of the marine area
    - a balanced and integrated evaluation incorporating the interlinked environmental,
    ecological, social and economic aspects of these interactions
    
  • statistical inference
  • spatial modelling
  • ecosystem services

MMO

KEY THEMES:

3: Integrated management

essential to the ecosystem-based approach: integrating information into a management function

    - requires an integrated approach across different sectors.
    - using evidence for decision making and applying methods to understand complex 
    information within a management framework.
    - produce sets of options that account for trade-offs and various policy and societal
    choices. 
  • data management and integration
  • predictive modelling
  • monitoring and evaluation tools

MMO Evidence gaps

A selection of goals from the MMO Evidence Gaps document

| MMO Evidence Strategy Sub theme | MMO Priority Group | Title | Description |
|---|---|---|---|
| 1.1 | A | The physical environment and implications for sediment disposal | An up-to-date understanding of physical conditions in the marine area is necessary to inform marine licensing decision making; in particular, information is required on describing waves, current direction and tidal flow as well as bathymetry and exploring the implications of this for sediment disposal within designated sites. |
| 1.1 | B | Biodiversity 'hotspots' in the English marine plan areas | Mechanisms to identify and describe 'hotspots' of biodiversity could allow marine managers to provide additional protection, e.g. through marine planning, to benefit the wider ecosystem. Such areas may be those which would support the functioning of the marine protected area network and/or may be sites which could in future be designated. |
| 1.1 | B | Improved distribution and condition data for protected species and habitats and the uptake of this information into marine management | Information is currently available to marine managers on the distribution and current condition of protected habitats and species. However, improvements to this information, including more detail regarding condition and improved resolution of distribution data, would improve its use in marine management. In addition, data improvements and development of mechanisms are required to more rapidly and efficiently incorporate on-going changes and improvements to data into marine planning and marine conservation work. |

MMO Evidence gaps

A selection of goals from the MMO Evidence Gaps document

| MMO Evidence Strategy Sub theme | MMO Priority Group | Title | Description |
|---|---|---|---|
| 1.2 | B | Improving the resolution of live spatial fishing data | MMO are undertaking the development of approaches such as iVMS to deliver improved resolution of live fishing activity data. This can enable the MMO to allow fishing activities closer to marine protected areas and thus better enable sustainable development within a framework of effective conservation management. Further exploration into effective and efficient mechanisms to gather live fishing data will be beneficial to marine management. |
| 1.3 | C | The social and economic benefit of commercial and recreational fishing activity | Development of methods to better describe and spatialise social and economic effort/landings for UK and non-UK fishing fleet is required to link fishing activity with the social and economic benefit that accrues from it. |
| 1.4 | D | Geographic definition of the economic value of supply chains linked to marine activities | Describing existing supply chains reliant on the marine area, their value and geographical distribution enables a description, in economic terms, of the supply chains linked to marine activities, both up and down supply chains, in order to better understand the full value of marine activities. |

MMO Evidence gaps

A selection of goals from the MMO Evidence Gaps document

| MMO Evidence Strategy Sub theme | MMO Priority Group | Title | Description |
|---|---|---|---|
| 2.4 | E | Displacement in the marine area; the likelihood and impact of displacement of marine activities | Not all marine sectors/activities can co-exist and thus it is inevitable that displacement occurs. A better understanding of displacement in the marine area could inform both marine licensing and marine plan development. Of particular interest is the likelihood of displacement between different sector pairings; the behaviour/response to displacement pressures; the type and magnitude of social, economic and environmental impacts of displacement; and the ability of activities to adapt to displacement with consideration of the likely magnitude of social, economic and environmental implications. |
| 3.1 | C | Seasonal risks of marine activities: balancing social, economic and environmental impacts | The required timing of marine works under marine licensing is generally driven by measures to mitigate environmental risks, e.g. risks to seasonal birds or migrating fish. This often means that works take place during summer months which could potentially lead to a larger impact on social and economic factors (e.g. tourism) than might occur during other seasons. A process to evaluate the risk to tourism and balance against other risks (environmental, social and economic) in addition to the development of potential approaches to mitigate the social and economic risk within the licensing framework as driven by current legislation and policy could better ensure sustainable development. |

Data

MMO Data

Data sources & formats:


Key datasets:

- Habitat data
- AIS Vessel Density & Tracklines
- Fish Landings by ICES rectangle
- Fish Landings to United Kingdom Ports
- Marine Management Organisation Marine Licences and Applications
- Administrative boundaries UKHO & Marine plan areas
- OS OpenData
- Office for National Statistics (Open Geography Portal boundary data allows the mapping of ONS statistics to admin areas)

2: OpenScience Tools

Programmatic access & processing

R - open source statistical programming language becoming increasingly important in data science


@ROpenSci

  • a network of #rstats users building tools to facilitate open science in R

  • collaborative open coding - repos publicly available on GitHub

develop API, data scraping and munging tools
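
To give a flavour of a munging step, here is a small sketch that totals landings per ICES rectangle. It is written in Python only for illustration; the same step could equally be done in R. The column names (`rectangle`, `species`, `landed_tonnes`), the figures and the helper `total_by_rectangle` are all made up for the example and are not the layout of any real MMO file.

```python
import csv
import io
from collections import defaultdict

# Invented sample data: column names and figures are assumptions.
SAMPLE_CSV = """rectangle,species,landed_tonnes
36E9,Cod,1.5
36E9,Haddock,2.0
37F0,Cod,0.5
"""


def total_by_rectangle(csv_text):
    """Sum landed tonnes per ICES rectangle from CSV text."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["rectangle"]] += float(row["landed_tonnes"])
    return dict(totals)


if __name__ == "__main__":
    for rectangle, tonnes in sorted(total_by_rectangle(SAMPLE_CSV).items()):
        print(rectangle, tonnes)
```

A per-rectangle total like this is the kind of table that could then be joined to boundary data for mapping.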

3: Data Science perspectives

web apps, visualisations, infographics

The science of art

to be continued…

@Sheffield_R biodiversity hacks

National Biodiversity Hack Event

Series of hacks throughout March 2016 focussing on the NBN Gateway and the @ROpenSci package rnbn


Open Data Science Initiative @datascienceshef

Data Hive events and further hacks

Open up the #HACK

Hack resources