Unlocking Data to Achieve Smarter Outcomes
Written by Mike Flatters, New Initiatives and Innovation Manager at MWH Global.
Effective infrastructure management relies on making sound, evidence-based decisions. Decision makers need access to the right information or data, in the right format, at the right time, so that it can be interpreted and used to inform the decision-making process.
Any disconnect between the information available, its timing and its interpretation can result in lost opportunities, reduced performance or poor decisions. Equally, too much information that is difficult to interpret can reduce clarity and impede the decision-making process.
Data science can help
This is where the application of data science can assist. Data science is a relatively new discipline that applies big data analysis principles and technologies to real business needs and outcomes.
[box] “Without good information, elected members and their communities will struggle to understand, and make effective choices about, their future needs. Local authorities need to build their capability to more effectively use asset condition and network performance information so that robust decisions can be made about managing infrastructure needs into the future.” New Zealand Auditor-General, 2014[i][/box]
Effective infrastructure management requires an evidence-based approach to decision making. However, the challenge issued above by the New Zealand Auditor-General pinpoints an infrastructure management problem that hinders many organisations, not just local authorities: infrastructure managers often cannot access or produce the timely, high-quality information needed for their planning, investment and maintenance activities. This can be especially frustrating if relevant data is known to exist but is somehow ‘locked’ or disconnected from the decision-making process. Locked data unnecessarily reduces performance and increases the risk of lost opportunities and error.
Locked data occurs in two main ways:
- Inadequate data processing, analysis and presentation skills to extract relevant information from raw data, especially as the volume and complexity of data increases.
- Needlessly closed data systems, where useful data sits in compartmentalised silos that cannot be readily accessed or combined with other data.
Unlocking the data
For most organisations, it is far easier to collect data than it is to appropriately examine, interrogate and interpret that same data. This gap looks destined to widen as the volume and variety of raw data balloon, with 90% of the world’s data created in the last two years alone, according to the McKinsey Global Institute (Manyika et al., 2011[ii]). Infrastructure data is also likely to follow this trend, as ever higher resolution data is collected to measure and document the state of assets.
Because of this exponential growth in data, increasingly specialised skills are required to unlock the insights hidden within it, which has led to the emergence of the relatively new discipline of Data Science. As the name suggests, Data Science adopts a scientific, research-oriented approach to knowledge creation from data, with a primary focus on delivering results to drive operational and strategic activities. Consequently, Data Scientists require a blend of skills and experience that ranges across software engineering, statistics, business analytics and domain-specific knowledge. These skills are used to design, implement and operate data analysis ‘pipelines’ that transform raw input and extract actionable results.
Common insight ‘extraction’ techniques used by Data Scientists include:
- Visualisation – Graphical display of information to convey results clearly and effectively. Information delivered in chart form is typically easier to understand than the same information presented in text or tables. Because of its explanatory power, visualisation is frequently used to communicate the outputs from analyses, or to describe processes diagrammatically, to decision makers or the public.
- Simulation – Modelling of real world processes to examine the behaviour of a system. Having an accurate ‘end-to-end’ representation of a system has many applications. Perhaps the most important is the ability to stress test a system to identify process weaknesses or capacity limits, without the risk of damage to the actual system itself.
- Prediction – Forecasts about the likelihood of events under given scenarios. Estimates about anticipated trends are essential for planning purposes. Knowing the future deterioration of, or demands on, infrastructure allows appropriate resources to be dedicated to maintenance and replacement programmes.
- Optimisation – Identification of the best course of action from a range of alternatives, after consideration of operational criteria and restrictions. Making optimal decisions can be extremely difficult when there are multiple inputs, project complexities, resource constraints and competing objectives. Common optimisation scenarios include: profit maximisation given finite resources to deliver products or services; cost minimisation whilst maintaining agreed quality standards; and planning efficient project workflows to avoid bottlenecks and delays. For public infrastructure works, multi-criteria optimisation is often required to identify the best balance between many, often competing, concerns. An example is minimising the environmental impact of a new hydroelectric dam while maximising its energy production.
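As a rough illustration of the optimisation technique described above, the sketch below picks the set of projects that delivers the greatest total benefit without exceeding a fixed budget. The project names, costs and benefit scores are invented for illustration, and the exhaustive search is only one simple way of framing the problem; it is not a method taken from any of the sources cited here.

```python
from itertools import combinations

# Hypothetical renewal projects: (name, cost in $m, estimated benefit score).
# All figures are invented for illustration only.
PROJECTS = [
    ("pipe_renewal", 4.0, 9.0),
    ("pump_upgrade", 3.0, 6.0),
    ("road_reseal", 5.0, 8.0),
    ("bridge_repair", 2.0, 5.0),
]


def best_portfolio(projects, budget):
    """Return (names, total_cost, total_benefit) for the highest-benefit
    subset of projects whose total cost does not exceed the budget.

    Exhaustive search: fine for a handful of projects, impractical for many.
    """
    best = ((), 0.0, 0.0)
    for size in range(1, len(projects) + 1):
        for subset in combinations(projects, size):
            cost = sum(p[1] for p in subset)
            benefit = sum(p[2] for p in subset)
            # Keep the subset only if it is affordable and beats the best so far.
            if cost <= budget and benefit > best[2]:
                best = (tuple(p[0] for p in subset), cost, benefit)
    return best


if __name__ == "__main__":
    names, cost, benefit = best_portfolio(PROJECTS, budget=9.0)
    print(names, cost, benefit)
```

A real programme would involve far more projects and several competing objectives, where a dedicated optimisation solver would likely be more appropriate than brute force.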
Data access and integration issues have led to the international growth of the Open Data movement, especially where data collection has been publicly funded. In addition to more efficient linkage, sharing and reuse, freely accessible and open data can increase transparency, accountability, trust and innovation within communities (Open Knowledge Foundation, 2012; The World Bank, 2013[iii]). Accordingly, several central governments have actively promoted Open Data principles, including those of Australia, the U.S.A. and New Zealand, as has the G8.
Central and regional governments, cities and organisations around the world are recognising the benefits of open data and have implemented online Open Data services such as data.govt.nz from New Zealand and data.london.gov.uk from London. A central feature of all these portals is that data can be retrieved via Application Programming Interfaces (APIs) that allow developers, analysts and programmers to access data in an automatable manner. As such, data can be effectively gathered and refreshed without the need for inefficient and error-prone manual downloads.
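To make the idea of automatable access concrete, here is a minimal sketch of querying a data portal programmatically. It assumes the portal exposes a CKAN-style Action API with a `package_search` endpoint, which this article does not confirm for any particular portal. The base URL is a placeholder and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder address (assumption): substitute the real portal's API root.
BASE_URL = "https://data.example.org"


def build_search_url(query, rows=5, base_url=BASE_URL):
    """Build a dataset search URL, assuming a CKAN-style Action API."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def dataset_titles(payload):
    """Extract dataset titles from a CKAN-style search response."""
    if not payload.get("success"):
        raise ValueError("API reported a failed request")
    return [item["title"] for item in payload["result"]["results"]]


def search_datasets(query, rows=5, base_url=BASE_URL):
    """Fetch matching dataset titles over HTTP (requires network access)."""
    with urlopen(build_search_url(query, rows, base_url), timeout=30) as resp:
        return dataset_titles(json.load(resp))
```

Because the request is just a URL, a scheduled script could re-run the same query to refresh its data rather than relying on a manual download.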
Open Data services are a very welcome start to better data accessibility, and the New Zealand Government’s Chief Information Officer outlines several case studies where open data has led to innovative and novel reuse of public data. These range from the ANZ bank predicting economic cycles six months in advance using light vehicle traffic volumes from the Transport Agency, to enabling Data Journalists at the NZ Herald newspaper to publish statistics-laden articles (NZ Government Chief Information Officer, 2015[iv]). Of particular note, 72% of Government departments reported efficiency gains from reusing data that had been opened by other public agencies (NZ Government Chief Information Officer, 2014[v]).
Open Data within New Zealand is, however, in its infancy, and the 2,980 available datasets currently listed at data.govt.nz are clearly a tiny fraction of the data collected by the public sector. It is also no surprise that local authority datasets are significantly underrepresented in the data.govt.nz catalogue, given the Auditor-General’s recommendation that many need to improve how they connect critical pieces of information together (New Zealand Auditor-General, 2014).
Openness of data alone, however, is not sufficient to create reliable information. The quality of the data being used is also essential, as captured by the hackneyed expression, “Garbage In, Garbage Out”. Advice from PwC on behalf of the European Commission suggests that the necessary elements of data quality include (Dekkers et al., 2013[vi]):
- Accuracy: Does the data correctly represent the real-world entity or event?
- Consistency: Is the data free of contradictions?
- Availability: Can the data be accessed now and over time?
- Completeness: Are all data items fully captured?
- Conformance: Does the data follow accepted standards? (e.g. metadata definitions)
- Credibility: Is the data based on trustworthy sources?
- Processability: Is the data machine-readable?
- Relevance: Does the dataset include an appropriate amount of data?
- Timeliness: Is data published soon enough?
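Several of these criteria can be checked automatically. The sketch below tests a few hypothetical asset records for completeness, consistency, conformance and timeliness. The field names, the 1 to 5 condition scale and the five-year inspection threshold are all assumptions made for illustration, not rules drawn from the cited guidance.

```python
from datetime import date

# Hypothetical asset records; field names and values are invented.
RECORDS = [
    {"asset_id": "P-001", "install_year": 1985, "condition": 3,
     "inspected": date(2015, 3, 1)},
    {"asset_id": "P-002", "install_year": None, "condition": 2,
     "inspected": date(2014, 11, 20)},
    {"asset_id": "P-003", "install_year": 2030, "condition": 7,
     "inspected": date(2009, 6, 5)},
]

REQUIRED = ("asset_id", "install_year", "condition", "inspected")


def quality_issues(record, today, max_age_days=5 * 365):
    """Return a list of quality problems found in one record."""
    issues = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED:
        if record.get(field) is None:
            issues.append(f"completeness: missing {field}")
    # Consistency: an asset cannot have been installed in the future.
    year = record.get("install_year")
    if year is not None and year > today.year:
        issues.append("consistency: install_year is in the future")
    # Conformance: condition grade must sit on the assumed 1-5 scale.
    grade = record.get("condition")
    if grade is not None and not 1 <= grade <= 5:
        issues.append("conformance: condition outside 1-5")
    # Timeliness: the last inspection must be recent enough.
    inspected = record.get("inspected")
    if inspected is not None and (today - inspected).days > max_age_days:
        issues.append("timeliness: inspection out of date")
    return issues


if __name__ == "__main__":
    for rec in RECORDS:
        print(rec["asset_id"], quality_issues(rec, today=date(2015, 7, 4)))
```

Criteria such as accuracy and credibility are harder to automate, since they depend on comparing records with the real-world asset or judging the source.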
Based on the findings of the Auditor-General, the quality of asset information held by many local authorities does not meet these criteria. This is a major concern given that they manage $98Bn of fixed assets, or almost 50% of New Zealand’s public infrastructure (Department of Internal Affairs, 2014[vii]). Until there are dramatic improvements in data openness and quality, coupled with appropriate analytical expertise, it is likely that many local authorities will not manage their asset infrastructure in a cost-effective manner.
Because of the scale of expenditure associated with public infrastructure, public authorities are expected to demonstrate that decisions are evidence-based, robust and have appropriately addressed civic concerns such as:
- Prioritisation of projects that deliver large net benefits
- Undertaking meaningful and informed consultation
- Public accountability
- Obtaining value for money
- Holistic assessment of issues and their downstream effects
- Quantification of risks and benefits across all plausible scenarios
- Evaluation of the opportunity costs associated with projects that proceed (or don’t)
Transparency about how a decision was reached, the evidence that was assessed and how that evidence was assembled is therefore of high public concern. At present, though, full, end-to-end public visibility of the data and analysis behind infrastructure projects is very uncommon. However, driven by the worldwide push for publishing Open Data, it’s likely that Open Analysis will become increasingly expected of publicly funded projects. Open Analysis would require all data, source code logic, models and commentary used to create results to be publicly released. This in turn promotes scrutiny, lets different scenarios and assumptions be tested and allows work to be reused across other projects. To maximise public involvement, Open Analysis work should ideally use open-source software and programming languages, to avoid the need for other parties to purchase commercial licences in order to reproduce work. Only with Open Analysis of this degree can the public properly assess and validate the logic used to guide infrastructure decisions, understand the risks and assumptions embedded within those decisions and truly participate as informed partners in the decision-making process.
Building trust in decision making
The volume and diversity of data are exploding and many organisations are struggling to extract timely and meaningful information from it. New Zealand local authorities are no exception, with most not using the full functionality of their asset management information systems according to the Auditor-General, who highlighted that they need to improve their ability to (New Zealand Auditor-General, 2014):
- Have good information about the condition and performance of their assets;
- Integrate that information with financial and service delivery decisions and risk management; and
- Link their spending on maintenance and renewals to an optimised decision-making approach.
These are significant criticisms when New Zealand local authorities are responsible for the management of $98Bn worth of fixed assets. Given the scale of the problem, Data Science techniques and Open Data systems are likely to be the most effective means of addressing those challenges. Data integration and information extraction from large volumes of raw, messy data require considerable skill, meaning that data scientists will become increasingly involved in the analysis of infrastructure assets. Open Data systems will also make data integration substantially easier and allow local authorities to more readily undertake fundamental analyses themselves.
With the move toward greater openness and transparency across government, there will be a growing push for Open Analysis to be completed for publicly funded projects, whereby all data and work is shared for public scrutiny and benefit. This will allow smart analysis that unlocks data to be more effectively recycled elsewhere, increase community engagement and build trust in decisions.
[i] New Zealand Auditor-General, 2014. Water and roads: funding and management challenges. Office of the Auditor-General, Wellington.
[ii] Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., Byers, A.H., 2011. Big Data: The Next Frontier for Innovation, Competition, and Productivity. McKinsey Global Institute.
[iii] Open Knowledge Foundation, 2012. Open Knowledge: What is Open? [WWW Document]. URL https://okfn.org/opendata/ (accessed 4.7.15).
The World Bank, 2013. Briefing on Open Data declarations.
[iv] NZ Government Chief Information Officer, 2015. Open Data Case Studies [WWW Document]. URL https://www.ict.govt.nz/guidance-and-resources/case-studies/open-data/ (accessed 4.7.15).
[v] NZ Government Chief Information Officer, 2014. 2014 Report on Agency Adoption of the Declaration on Open Government Data | ICT.govt.nz [WWW Document]. URL https://www.ict.govt.nz/guidance-and-resources/open-government/declaration-open-and-transparent-government/2014-report-on-adoption-of-the-declaration/ (accessed 4.7.15).
[vi] Dekkers, M., Loutas, N., De Keyzer, M., Goedertier, S., 2013. Open Data & Metadata Quality.
[vii] Department of Internal Affairs, 2014. Briefing to Incoming Minister of Local Government [WWW Document]. URL http://www.dia.govt.nz/pubforms.nsf/URL/Local-Government-BIM-2014.pdf/$file/Local-Government-BIM-2014.pdf (accessed 4.7.15).