|
127 | 127 | "\n",
|
128 | 128 | "Here, we use an **f-string** to combine the `station` variable (which takes on a value from the `new_stations` **list** on each pass through the loop) with `'data.csv'`, so that the resulting file names will be `'oxforddata.csv'`, `'southamptondata.csv'`, and `'stornowaydata.csv'`. We then use `Path()` to combine this with the `'data'` directory name, so that the value of `fn_data` is the complete relative path to each file.\n",
|
129 | 129 | "\n",
|
130 |
| - "Next, we use `pd.read_csv()` to read in the file, and add a `station` variable to the table, just like we did with the Armagh data. \n", |
| 130 | + "Next, we again use `pd.read_csv()` to read in the file, and add a `station` variable to the table, just like we did with the Armagh data. \n", |
131 | 131 | "\n",
|
132 | 132 | "Finally, we use `pd.concat()` ([documentation](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.concat.html)) to combine the existing table, `station_data`, with the newly loaded table (`data`), and overwrite the value of `station_data` with this combined table:\n",
|
133 | 133 | "\n",
|
|
136 | 136 | "\n",
|
137 | 137 | "```\n",
|
138 | 138 | "\n",
|
139 |
| - "Each time through the **for** loop, the value of `station` is updated:" |
| 139 | + "Remember that each time through the **for** loop, the value of `station` is updated - so on the first time through, the value of `station` will be `'oxford'`, on the second time through it will be `'southampton'`, and on the final time through it will be `'stornoway'`:" |
140 | 140 | ]
|
141 | 141 | },
|
142 | 142 | {
|
|
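For reference, the path construction described in this hunk can be sketched on its own. This is a minimal illustration rather than the notebook's actual cell: the `new_stations` values and the `'data'` directory come from the text above, while the `paths` list is a hypothetical helper added only so the result can be inspected.

```python
from pathlib import Path

# Station names as listed in the notebook text.
new_stations = ['oxford', 'southampton', 'stornoway']

# Hypothetical helper list, used here only to collect the results.
paths = []

for station in new_stations:
    # The f-string inserts the current value of `station` before 'data.csv';
    # Path() then joins the 'data' directory name with that file name.
    fn_data = Path('data', f'{station}data.csv')
    paths.append(fn_data)
    print(station, '->', fn_data.as_posix())
```

On each pass, `station` takes the next value from the list, so `fn_data` points at a different file every time without any path being written out by hand.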
161 | 161 | "id": "325becd3-fb17-48c4-9314-59f225aa8239",
|
162 | 162 | "metadata": {},
|
163 | 163 | "source": [
|
164 |
| - "Note that this is one advantage of using clear, consistent naming and formatting for data files - we can easily write a loop to load multiple files, instead of having to write individual paths.\n", |
| 164 | + "Note that this is one advantage of using **clear, consistent naming and formatting for data files** - it means that we can easily write a loop to load multiple files, instead of having to write individual paths or treat each file differently!\n", |
165 | 165 | "\n",
|
166 | 166 | "## selecting rows using expressions\n",
|
167 | 167 | "\n",
|
|
185 | 185 | "id": "ff430bcf-9e94-4fb6-9931-6b1fde1ef64e",
|
186 | 186 | "metadata": {},
|
187 | 187 | "source": [
|
188 |
| - "If we want to use multiple conditions - for example, all observations where the monthly maximum temperature is greater than 20°C, and the monthly rainfall is grater than 100 mm, we can't simply use the `&` operator with the two statements:" |
| 188 | + "Remember that if we want to use multiple conditions - for example, all observations where the monthly maximum temperature is greater than 20°C, and the monthly rainfall is greater than 100 mm, we can't simply use the `&` operator with the two statements:" |
189 | 189 | ]
|
190 | 190 | },
|
191 | 191 | {
|
|
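To make the point about multiple conditions concrete, here is a small sketch of the pattern that does work: wrapping each comparison in its own parentheses before combining them with `&`. The column names `tmax` and `rain` and all of the values below are assumptions made up for illustration; they are not taken from the station files.

```python
import pandas as pd

# Made-up values for illustration only; the column names are assumed.
station_data = pd.DataFrame({
    'station': ['oxford', 'oxford', 'stornoway', 'southampton'],
    'tmax': [22.5, 18.0, 21.0, 24.1],
    'rain': [120.0, 130.0, 80.0, 101.5],
})

# In Python, & binds more tightly than >, so each comparison
# needs its own parentheses before the two are combined.
hot = (station_data['tmax'] > 20)
wet = (station_data['rain'] > 100)

# Keep only the rows where both conditions are True.
hot_and_wet = station_data.loc[hot & wet]
print(hot_and_wet)
```

Without the parentheses, Python would try to evaluate `20 & station_data['rain']` first, which is not the comparison we intended and will typically raise an error.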
955 | 955 | "metadata": {},
|
956 | 956 | "outputs": [],
|
957 | 957 | "source": [
|
958 |
| - "sample = station_data \\\n", |
| 958 | + "station_data \\\n", |
959 | 959 | " .groupby('station') \\\n",
|
960 | 960 | " .sample(5)"
|
961 | 961 | ]
|
|
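The chained `.groupby('station').sample(5)` call in this hunk can be tried on a small table. The sketch below uses made-up data, and the `random_state` argument is an addition (not in the notebook cell) so that the selection is repeatable; it also uses parentheses instead of backslashes for the line continuation, which is an equivalent style.

```python
import pandas as pd

# Made-up table with eight rows per station, for illustration only.
station_data = pd.DataFrame({
    'station': ['oxford'] * 8 + ['stornoway'] * 8,
    'tmax': [float(v) for v in range(16)],
})

# Group the rows by station, then draw 5 random rows from each group.
# random_state is an added assumption so the sample is reproducible.
sample = (
    station_data
    .groupby('station')
    .sample(5, random_state=42)
)

# Count how many rows were drawn for each station.
print(sample.groupby('station').size())
```

Leaving off the `sample =` assignment, as the changed cell does, means the notebook displays the sampled table directly instead of storing it in a variable.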
1004 | 1004 | "name": "python",
|
1005 | 1005 | "nbconvert_exporter": "python",
|
1006 | 1006 | "pygments_lexer": "ipython3",
|
1007 |
| - "version": "3.11.6" |
| 1007 | + "version": "3.10.15" |
1008 | 1008 | }
|
1009 | 1009 | },
|
1010 | 1010 | "nbformat": 4,
|
|