GitHub - UMassCDS/ds4cg2019-mapc: The Metropolitan Area Planning Council (MAPC) team worked on enhancing a forecasting technique originally developed at Northeastern University to project scenarios about future populations, enhancing its efficiency, ease of use, and breadth of applicability. The results will help MAPC to better serve the cities and towns that rely on their expertise for municipal planning.

Saurabh Shirodkar and Pratheek Mallya

DS4CG MAPC

This file describes the process of creating targets and executing the algorithm for the MAPC reweighting procedure.

The target tables should be entered into the file "reweighting_config.json". The format of the file can be understood from the following example:


{
    "file_name": "testhh.csv",
    "blocks" : [
        {
            "target_var":  "PWGTP",
            "special_cond_var": "none",
            "special_cond_target": "none",
            "tables": [
                {
                  "name": "FirstTable",
                  "dims": [
                    {
                      "var": "AGEP",
                      "type": "NUM",
                      "conditions": [
                        "x <= 2 ",
                        "2 < x & x <= 4",
                        "4 < x & x <= 6",
                        "x > 6"
                      ]
                    },
                    {
                        "var": "SEX",
                        "type": "NUM",
                        "conditions": [
                            "x == 1",
                            "x == 2"
                        ]
                    }
                  ]
                }
            ]
        },
        {
            "target_var": "WGTP",
            "special_cond_var": "SPORDER",
            "special_cond_target": "1",
            "tables": [
                {
                    "name": "SecondTable",
                    "dims": [
                        {
                            "var": "ADJINC",
                            "type": "NUM",
                            "conditions": [
                                "x == 3",
                                "x == 4",
                                "x == 5",
                                "x == 6"
                            ]
                        },
                        {
                            "var": "HHTYPE",
                            "type": "NUM",
                            "conditions": [
                                "x == 1",
                                "x == 2",
                                "x > 3"
                            ]
                        }
                    ]
                }
            ]
        }
    ]
}

"file_name" specifies the input CSV file (the PUMS dataset).

Each block represents a different target variable, along with any special conditions to be imposed on the block.

"target_var" specifies the name of the target variable for the tables in the particular block.

"special_cond_var" specifies the name of the variable under which any special filtering condition needs to be applied.

"special_cond_target" specifies the value of the special condition variable for filtering.

Each block consists of a set of tables.

Each object in the "tables" array represents a table. There can be an arbitrary amount of tables.

For each table, the "name" field indicates the name of the table. Each object of the "dims" array indicates one variable or dimension. The "var" field of a variable should be a header in the input CSV file. The "type" field can take values "NUM" or "CHAR" and denotes the type of the variable. The "conditions" array is a set of conditions that represent the columns of the table corresponding to the varaible. Each of these should be a logical expression (that evaluates to a boolean) that can be parsed by the R interpreter. The variable in this expression should be denoted by the letter 'x'.

The number of dimensions, as well as the number of conditions within each dimension, can be arbitrarily large. In the above example, the dimension of "FirstTable" is 3x2x2 while that of "SecondTable" is just 3.

After specifying "reweighting_config.json" in this format, run the script "read_config.R". Corresponding to each table, a CSV file will be generated with the same name as that table. Open the CSV file for each table (using a software such as MS Excel or LibreOffice Calc, or a text editor) and fill in the target values corresponding to each cell of the target table.

After specifying the target tables, run the script "test.R". The number of iterations can be modified at the top of the script. At the end of the script, the output weights will be written to the file "weights_final.csv".

Dependencies: R Packages:

tidyverse
rjson
data.table

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
data		data
unused		unused
.gitignore		.gitignore
README.md		README.md
base_algo.R		base_algo.R
base_algo_hh.R		base_algo_hh.R
objective.R		objective.R
read_config.R		read_config.R
reweighting_config.json		reweighting_config.json
test.R		test.R

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Contributors 3

Languages

UMassCDS/ds4cg2019-mapc

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages