Locify input csv fromUri column in excelerator #36

jyucsiro · 2019-12-05T12:43:36Z

This PR features enhancements to enable users to use a CSV file that has location codes as input rather than Loc-I URIs.

Source data often contains location codes (e.g. ASGS 2016 MeshBlock codes) alongside the values for that dataset. Currently users have to manually convert the location codes in that column to Loc-I URIs (e.g. from 50055290000 to http://linked.data.gov.au/dataset/asgs2016/meshblock/50055290000). This is a barrier, as users may not know how to do the string conversion manually, but they should know which location type the codes represent. Letting users select the dataset and datatype of the location codes from a dropdown box in the UI is a more user-friendly approach.

This PR provides code that will prepend the Loc-I datatype prefix to the location code, so excelerator can map location code to Loc-I URI based on the prefixes for each dataset and dataset type.
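As a rough illustration of the idea (not the code in this PR), the prefixing step could be sketched as below. The `PREFIXES` table, its key format, and the `isUri` and `locify` helper names are hypothetical; only the MeshBlock prefix is taken from the example above.

```javascript
// Hypothetical prefix table keyed by "<dataset>/<datatype>".
// The structure is an assumption for illustration, not excelerator's actual data model.
const PREFIXES = {
  "asgs2016/meshblock": "http://linked.data.gov.au/dataset/asgs2016/meshblock/",
};

// Returns true when the value already looks like an http(s) URI.
function isUri(value) {
  return /^https?:\/\//i.test(String(value).trim());
}

// Prepend the prefix for the selected dataset type,
// leaving values that are already URIs untouched.
function locify(value, datasetType) {
  const text = String(value).trim();
  if (isUri(text)) {
    return text;
  }
  const prefix = PREFIXES[datasetType];
  if (prefix === undefined) {
    throw new Error(`Unknown dataset type: ${datasetType}`);
  }
  return prefix + text;
}

console.log(locify("50055290000", "asgs2016/meshblock"));
```

Passing a value that is already a URI returns it unchanged, so a column that mixes codes and URIs would still convert consistently under this sketch.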

Related to #29

excelerator/imports/api/methods-init.js

shaneseaton

At a high level, I am not convinced this is good. Without question I think this makes the task 'a bit easier', but I think it allows people to forgo the 'education' of working with Loci URIs. Forcing people to use the URIs changes their mindset and encourages them to start using the "correct" data. Once they start using the URIs, they flow through their data into other areas: system databases, etc. It also shows people where they can find out more about the data.

Hiding these details is essentially hiding Loci, and really I think we are trying to increase visibility.

Personally, I would prefer a tool that converted their data into the right format as an extra step, if you wanted to help them learn, and use the tool.

I have approved this as it looks like it should do what you want; I'm just not convinced it is what the Loci big picture wants.

jyucsiro · 2019-12-05T23:27:49Z

> At a high level, I am not convinced this is good. Without question I think this makes the task 'a bit easier', but I think it allows people to forgo the 'education' of working with Loci URIs. Forcing people to use the URIs changes their mindset and encourages them to start using the "correct" data. Once they start using the URIs, they flow through their data into other areas: system databases, etc. It also shows people where they can find out more about the data.

A bit of a chicken-and-egg situation. I agree that URIs should be the ultimate identifier across systems, but that requires education. That puts up a barrier to short-term adoption of the Loc-I approach/tools. There is a question as to whether it's the role of end-users to add these prefixes to make them Loc-I identifiers/URIs.

Meanwhile, end-users already have this data or source this data from the ABS, AURIN, and other data publishers in this form. I feel we want to show them that via Loc-I, these transformations are possible in a backwards compatible way with data that they are downloading from other sources or that already exists in their workflow.

My hunch is that the flow of URIs will primarily come at the data provider end, and at places where data may be warehoused internally in an organisation. It wouldn't be a big change to the data providers' process. Once that happens, end-users will receive data with Loc-I URIs and then just use them.

Worth discussing this further, but for now.

> Hiding these details is essentially hiding Loci, and really I think we are trying to increase visibility.

I don't feel like we're hiding this: the converted data returns with Loc-I identifiers, so it's visible there and provides a point for educating end-users.

> Personally, I would prefer a tool that converted their data into the right format as an extra step, if you wanted to help them learn, and use the tool.

I think we need this tool too, but that is a bigger task.

> I have approved this as it looks like it should do what you want; I'm just not convinced it is what the Loci big picture wants.

Thanks. There is the future user study activity and engagement, which will help inform our team on whether this is desirable. We can explore the option and decide then whether to keep or drop this.

dr-shorthair · 2019-12-06T00:14:56Z

Thanks @shaneseaton for putting forward this perspective.
The general principle is good, and yes, persistent web-compatible identifiers are the glue that holds the system together.
But they are not things that end-users should be tangling with.
Hyperlinks on web-pages hide the URL behind anchor text or an image.
Technically oriented people can find the URL easily, but most people, including the Loc-I end-users, do not need to tangle with them in order to use Loc-I services.
Developers need to tangle with URIs, JSON, XML, RDF etc.
But CSV is a step up from that, closer to technical end users (our scientists) for whom Excel is an everyday tool.

jyucsiro · 2019-12-06T00:31:39Z

Tests using data excerpts from AURIN's aggregation to SA1 of Population & Dwelling Counts 2016 Census for Australia from ABS (original data: https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/2074.02016?OpenDocument) are working well for excelerator input and for conversions out to different geographies.

population-sa1-2016-test.zip

Jonathan Yu and others added 2 commits December 5, 2019 23:27

adding prefix to initialisation data structure

6602e01

modify convert to convert fromUri values to URI if it is not a URI

f7d6d85

jyucsiro added the Type: Enhancement label Dec 5, 2019

jyucsiro self-assigned this Dec 5, 2019

jyucsiro added 2 commits December 5, 2019 23:44

remove logging

03f986c

remove duplicate if statement

815491c

jyucsiro requested review from shaneseaton and benjaminleighton December 5, 2019 13:01

benjaminleighton approved these changes Dec 5, 2019

shaneseaton reviewed Dec 5, 2019

shaneseaton approved these changes Dec 5, 2019

use function to map asgs11 and gnaf16 prefix info instead of copy-paste

74b1f69

jyucsiro merged commit 4868104 into master Dec 9, 2019

jyucsiro deleted the jonathan/feature/locify-input-code branch December 9, 2019 00:47
