Locify input csv fromUri column in excelerator #36
At a high level, I am not convinced this is good. Without question it makes the task 'a bit easier', but I think it allows people to forgo the 'education' of working with Loci URIs. Forcing people to use the URIs changes their mindset and encourages them to start using the "correct" data. Once they start using the URIs, the URIs flow through their data into other areas: system databases, etc. It also shows people where they can find out more about the data.
Hiding these details is essentially hiding Loci, and really I think we are trying to increase visibility.
Personally, I would prefer a tool that converted their data into the right format as an extra step, if you wanted to help them learn, and use the tool.
I have approved this as it looks like it should do what you want; I am just not convinced it is what the Loci big picture wants.
A bit of a chicken-and-egg situation. I agree that URIs should be the ultimate identifier across systems, but that requires education, which puts up a barrier to short-term adoption of the Loc-I approach/tools. There is a question as to whether it's the role of end-users to add these prefixes to make them Loc-I identifiers/URIs. Meanwhile, end-users already have this data, or source it from the ABS, AURIN, and other data publishers, in this form. I feel we want to show them that via Loc-I, these transformations are possible in a backwards-compatible way with data that they are downloading from other sources or that already exists in their workflow. My hunch is that the flow of URIs will primarily come at the data provider end, and at places where data may be warehoused internally in an organisation. It wouldn't be a big change to the data providers' process. Once that happens, end-users will receive data with Loc-I URIs and then just use them. Worth discussing this further, but for now.
I don't feel like we're hiding this - as the converted data returns with Loc-I identifiers, so it's visible there and provides a point for educating end-users.
I think we need this tool too, but that is a bigger task.
Thanks. The future user study activity and engagement will help inform our team on whether this is desirable... Explore the option and we can decide to keep/drop this then.
Thanks @shaneseaton for putting this perspective.
Testing with data excerpts from AURIN's aggregation to SA1 of Population & Dwelling Counts 2016 Census for Australia from ABS (original data - https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/2074.02016?OpenDocument) is working well for excelerator input and conversions out to different geographies.
This PR features enhancements to enable users to use a CSV file that has location codes as input rather than Loc-I URIs.
Often source data will have the location codes (e.g. ASGS 2016 MeshBlock codes) and values for that particular dataset. Currently users have to manually convert the values in the column with location codes to Loc-I URIs (e.g. from `50055290000` to `http://linked.data.gov.au/dataset/asgs2016/meshblock/50055290000`). This is a barrier, as users may not know how to do the string conversion manually, but they should know which location type it is. Having users select the dataset and datatype of the location codes from the dropdown box in the UI is therefore a more user-friendly way to go.
This PR provides code that will prepend the Loc-I datatype prefix to the location code, so excelerator can map each location code to a Loc-I URI based on the prefixes for each dataset and dataset type.
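For readers who want to see the idea concretely, here is a minimal sketch of the prefixing step. It is not the code in this PR: the names `LOCI_PREFIXES`, `to_loci_uri` and `locify_csv` are hypothetical, the assumption that the location code sits in the first CSV column is mine, and the only prefix listed is the ASGS 2016 MeshBlock one quoted above.

```python
import csv
import io

# Hypothetical lookup table: (dataset, datatype) -> Loc-I URI prefix.
# Only the MeshBlock prefix quoted in this PR description is included.
LOCI_PREFIXES = {
    ("asgs2016", "meshblock"): "http://linked.data.gov.au/dataset/asgs2016/meshblock/",
}


def to_loci_uri(code, dataset, datatype):
    """Prepend the prefix for the selected dataset/datatype to a location code."""
    value = code.strip()
    # Leave values that already look like URIs untouched, so existing
    # Loc-I URI input keeps working.
    if value.startswith(("http://", "https://")):
        return value
    return LOCI_PREFIXES[(dataset, datatype)] + value


def locify_csv(text, dataset, datatype, column=0):
    """Return CSV text with the location-code column rewritten as Loc-I URIs.

    Assumes the CSV has no header row and that `column` holds the codes.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(text)):
        if row:  # skip the rewrite for blank lines
            row[column] = to_loci_uri(row[column], dataset, datatype)
        writer.writerow(row)
    return out.getvalue()
```

Calling `locify_csv("50055290000,12\n", "asgs2016", "meshblock")` rewrites the first column and leaves the value column alone. Passing through values that already start with `http` is a design choice in this sketch so that code-based and URI-based input can share one path.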
Related to #29