Locify input csv fromUri column in excelerator #36

Merged: 5 commits into master from jonathan/feature/locify-input-code on Dec 9, 2019

Conversation

@jyucsiro jyucsiro commented Dec 5, 2019

This PR features enhancements to enable users to use a CSV file that has location codes as input rather than Loc-I URIs.

Source data often has location codes (e.g. ASGS 2016 MeshBlock codes) and values for that particular dataset. Currently, users have to manually convert the values in the location code column to Loc-I URIs (e.g. from 50055290000 to http://linked.data.gov.au/dataset/asgs2016/meshblock/50055290000). This is a barrier, as users may not know how to do the string conversion manually, but they should know which location type it is. Letting users select the dataset and datatype of the location codes from a dropdown box in the UI is therefore a more user-friendly way to go.

This PR provides code that prepends the Loc-I datatype prefix to the location code, so excelerator can map each location code to a Loc-I URI based on the prefix for each dataset and datatype.
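
As a rough illustration of the prepending step (a minimal sketch, not the code in this PR), the conversion could look like the following. The function names, the `LOCI_PREFIXES` mapping and its key names are hypothetical. Only the meshblock prefix is taken from the example above.

```python
import csv
import io

# Hypothetical lookup of (dataset, datatype) -> Loc-I URI prefix.
# Only the meshblock prefix below comes from the PR description;
# the structure and key names are assumptions for illustration.
LOCI_PREFIXES = {
    ("asgs2016", "meshblock"): "http://linked.data.gov.au/dataset/asgs2016/meshblock/",
}


def locify(code, dataset, datatype):
    """Return a Loc-I URI for a bare location code.

    Values that already look like URIs are passed through unchanged.
    """
    prefix = LOCI_PREFIXES[(dataset, datatype)]
    code = code.strip()
    if code.startswith(("http://", "https://")):
        return code
    return prefix + code


def locify_csv(text, column, dataset, datatype):
    """Rewrite one column of CSV text so it holds Loc-I URIs."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = locify(row[column], dataset, datatype)
        writer.writerow(row)
    return out.getvalue()
```

For example, `locify("50055290000", "asgs2016", "meshblock")` yields the meshblock URI shown above. Passing existing URIs through unchanged is a design choice of this sketch, so that input already in Loc-I form is not double-prefixed.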

Related to #29

@jyucsiro jyucsiro self-assigned this Dec 5, 2019

@shaneseaton shaneseaton left a comment

At a high level, I am not convinced this is good. Without question I think this makes the task 'a bit easier', but I think it allows people to forgo the 'education' of working with Loci URIs. Forcing people to use the URIs changes their mindset and encourages them to start using the "correct" data. Once they start using the URIs, the URIs flow through their data into other areas, such as system databases. It also shows people where they can find out more about the data.

Hiding these details is essentially hiding Loci, and really I think we are trying to increase visibility.

Personally, I would prefer a tool that converted their data into the right format as an extra step, if you wanted to help them learn, and use the tool.

I have approved this as it looks like it should do what you want; I am just not convinced it is what the Loci big picture wants.

jyucsiro commented Dec 5, 2019

> At a high level, I am not convinced this is good. Without question I think this makes the task 'a bit easier', but I think it allows people to forgo the 'education' of working with Loci URIs. Forcing people to use the URIs changes their mindset and encourages them to start using the "correct" data. Once they start using the URIs, the URIs flow through their data into other areas, such as system databases. It also shows people where they can find out more about the data.

A bit of a chicken-and-egg situation. I agree that URIs should be the ultimate identifier across systems, but that requires education. That puts up a barrier to short-term adoption of the Loc-I approach/tools. There is a question as to whether it's the role of end-users to add these prefixes to make them Loc-I identifiers/URIs.

Meanwhile, end-users already have this data or source this data from the ABS, AURIN, and other data publishers in this form. I feel we want to show them that via Loc-I, these transformations are possible in a backwards compatible way with data that they are downloading from other sources or that already exists in their workflow.

My hunch is that the flow of URIs will primarily come at the data provider end, and at places where data may be warehoused internally in an organisation. It wouldn't be a big change to the data providers' process. Once that happens, end-users will receive data with Loc-I URIs and then just use them.

Worth discussing this further, but for now.

> Hiding these details is essentially hiding Loci, and really I think we are trying to increase visibility.

I don't feel like we're hiding this, as the converted data returns with Loc-I identifiers, so it's visible there and provides a point for educating end-users.

> Personally, I would prefer a tool that converted their data into the right format as an extra step, if you wanted to help them learn, and use the tool.

I think we need this tool too, but that is a bigger task.

> I have approved this as it looks like it should do what you want; I am just not convinced it is what the Loci big picture wants.

Thanks. The future user study activity and engagement will help inform our team on whether this is desirable. We can explore the option and decide whether to keep or drop this then.

@dr-shorthair

Thanks @shaneseaton for putting forward this perspective.
The general principle is good, and yes, persistent web-compatible identifiers are the glue that holds the system together.
But they are not things that end-users should be tangling with.
Hyperlinks on web pages hide the URL behind anchor text or an image.
Technically oriented people can find the URL easily, but most people, including the Loc-I end-users, do not need to tangle with them in order to use Loc-I services.
Developers need to tangle with URIs, JSON, XML, RDF etc.
But CSV is a step up from that, closer to technical end users (our scientists) for whom Excel is an everyday tool.

jyucsiro commented Dec 6, 2019

Tests using data excerpts from AURIN's aggregation to SA1 of Population & Dwelling Counts 2016 Census for Australia from ABS (original data: https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/2074.02016?OpenDocument) are working well for excelerator input and conversions out to different geographies.

population-sa1-2016-test.zip

@jyucsiro jyucsiro merged commit 4868104 into master Dec 9, 2019
@jyucsiro jyucsiro deleted the jonathan/feature/locify-input-code branch December 9, 2019 00:47
4 participants