The goal of this repository is to predict the presence of rocky terrain using topography and remote sensing data. This information includes:

- Topography Features (aspect, elevation, flow length, plan curvature, profile curvature, slope, tan curvature, twi)
- Remote Sensing Features (vegetation index, moisture index, bulk density, soil organic carbon, clay, sand)

Note: Every coordinate is considered in isolation, so this model follows a point-based approach.
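
To make the point-based idea concrete, here is a minimal sketch of how a single coordinate could be turned into one feature vector without looking at any neighbouring points. The snake_case feature names and the `point_to_vector` helper are hypothetical; the actual column names in the repository's data files may differ.

```python
# Feature names mirror the two lists above. The snake_case spellings are an
# assumption, not the repository's actual column names.
TOPOGRAPHY_FEATURES = [
    "aspect", "elevation", "flow_length", "plan_curvature",
    "profile_curvature", "slope", "tan_curvature", "twi",
]
REMOTE_SENSING_FEATURES = [
    "vegetation_index", "moisture_index", "bulk_density",
    "soil_organic_carbon", "clay", "sand",
]
ALL_FEATURES = TOPOGRAPHY_FEATURES + REMOTE_SENSING_FEATURES


def point_to_vector(point):
    """Build one feature vector from a single coordinate's covariates.

    `point` is a dict mapping feature name to value. No information from
    other coordinates is used, which is what "point-based" means here.
    """
    missing = [name for name in ALL_FEATURES if name not in point]
    if missing:
        raise KeyError(f"missing covariates: {missing}")
    return [float(point[name]) for name in ALL_FEATURES]
```

Each coordinate yields a 14-element vector, and the classifier sees one such vector per prediction.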
- Clone this repository.

      git clone https://github.com/SoilForestHealth/rocky-terrain-classifier.git

- Install all packages in the `requirements.txt` file.

      pip install -r requirements.txt

- The following directory structure is required for the code in this repository to work properly:

      ├── data
      │   ├── LSF_Grid_Soil_Data_2025_Summer.xlsx
      │   ├── LSF_Topography_Covariates_2025_Summer.csv
      │   ├── metadata
      │   │   ├── data.json
      │   │   ├── system.json
      │   │   └── tune.json
      │   ├── raw_field_summer_2025_covariates_combined.csv
      │   └── s2_cloudless_covariates_field_summer_2025_combined.csv
      ├── main.py
      ├── pipeline
      │   ├── evaluate.py
      │   ├── model.py
      │   ├── preprocess.py
      │   └── select.py
      └── requirements.txt

      4 directories, 13 files

- To execute the code in this repository, run the `main.py` file. Ensure you are in the root directory of the repository.

      python3 main.py

- Feel free to raise an issue if there are any problems with the repository!

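
Since the code depends on the directory structure listed above, it can help to verify the layout before running `main.py`. The following is a minimal sketch, not part of the repository: the `missing_files` helper is hypothetical, and the file paths are copied from the tree above.

```python
from pathlib import Path

# Relative paths taken from the directory tree in this README.
REQUIRED_FILES = [
    "data/LSF_Grid_Soil_Data_2025_Summer.xlsx",
    "data/LSF_Topography_Covariates_2025_Summer.csv",
    "data/metadata/data.json",
    "data/metadata/system.json",
    "data/metadata/tune.json",
    "data/raw_field_summer_2025_covariates_combined.csv",
    "data/s2_cloudless_covariates_field_summer_2025_combined.csv",
    "main.py",
    "pipeline/evaluate.py",
    "pipeline/model.py",
    "pipeline/preprocess.py",
    "pipeline/select.py",
    "requirements.txt",
]


def missing_files(root="."):
    """Return the required paths that do not exist as files under `root`."""
    root = Path(root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

Calling `missing_files()` from the repository root returns an empty list when everything is in place; otherwise it lists the paths that still need to be added.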
To evaluate the models for the imbalanced binary classification problem, we use macro_avg_recall and f2_score.
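
As an illustration of these two metrics, here is a dependency-free sketch for binary labels. It assumes the positive class (label `1`) marks rocky terrain and that F2 is computed on that class only; the README does not state how the repository averages F2, so treat that as an assumption. With scikit-learn, `recall_score(y_true, y_pred, average="macro")` and `fbeta_score(y_true, y_pred, beta=2)` compute the same quantities.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for binary labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t == positive and p != positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def macro_avg_recall(y_true, y_pred, positive=1):
    """Unweighted mean of the recall on each of the two classes."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


def f2_score(y_true, y_pred, positive=1, beta=2.0):
    """F-beta on the positive class; beta=2 weights recall above precision."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Toy imbalanced example: 3 positives, 7 negatives (made-up labels).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
print(round(macro_avg_recall(y_true, y_pred), 3))  # 0.762
print(round(f2_score(y_true, y_pred), 3))          # 0.667
```

Both metrics reward finding the minority class: macro-averaged recall gives each class equal weight regardless of its size, and F2 penalises missed positives more heavily than false alarms.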
| Model | macro_avg_recall | f2_score | wandb sweeps |
|---|---|---|---|
| Logistic Regression | 0.557 | 0.557 | wandb |
| Decision Tree | 0.654 | 0.658 | wandb |
| Random Forest | 0.723 | 0.731 | wandb |
| Extra Trees | 0.696 | 0.700 | wandb |
| Gradient Boosting Trees | 0.724 | 0.724 | wandb |
The overall model comparison dashboard is here.