- Introduction
- Development process
- Inputs and outputs of a knowledge discovery algorithm
- Recent updates
- Features
- Installing GDAL
- Maintenance
- Try your first geoAnalytics program
- License
- Documentation
- Getting Help
- Discussion and Development
- Contribution to geoAnalytics
- Tutorials
- Real-World Case Studies
geoAnalytics is an open-source, Python-based machine learning library developed to discover useful information hidden in raster data. The algorithms provided in this library cover a wide spectrum of machine learning tasks, such as imputation, image fusion, clustering, classification, one-class classification, and pattern mining. Being platform independent, the library can run on any operating system.
- Version 2025.7.2:
In this latest version, the following updates have been made:
- Added new pattern-discovery algorithms to patternMining: CorrelatedPatternMining, CoveragePatternMining, FaultTolerantFrequentPatternMining, FrequentPatternMining, LocalPeriodicPatternMining, PartialPeriodicFrequentPatternMining, PartialPeriodicPatternMining, PartialPeriodicPatternInMultipleTimeSeries, PeriodicCorrelatedPatternMining, RecurringPatternMining, RelativeFrequentPatternMining, and PeriodicFrequentPatternMining.
- Added new algorithms to normalization.
- Implemented test cases using the geoanalytics package.
Total number of algorithms: 50+
- ✅ Tested to the best of our ability
- 🔋 Highly optimized to our best effort, light-weight, and energy-efficient
- 👀 Proper code documentation
- 🍼 Sample examples of using various algorithms at ./tests folder
- 🤖 Works with AI libraries such as TensorFlow, PyTorch, and sklearn.
- ⚡️ Supports CUDA
- 🖥️ Operating System Independence
- 🔬 Knowledge discovery in static data and streams
- 🐎 Snappy
- 🐻 Ease of use
GDAL is an important toolkit in our library. It converts raster data in any GDAL-supported format into human-readable text or CSV format.
We present the steps to install this toolkit using a Conda environment on a machine running the Ubuntu operating system.
sudo apt-get update && sudo apt upgrade -y && sudo apt autoremove
sudo apt-get install -y cdo nco gdal-bin libgdal-dev
pip install --global-option=build_ext --global-option="-I/usr/include/gdal" GDAL==`gdal-config --version`
python -m pip install --upgrade pip setuptools wheel
python -m pip install --upgrade gdal
If the above two commands fail to install gdal, execute the following commands:
conda install -c conda-forge libgdal
conda install -c conda-forge gdal
conda install tiledb=2.2
conda install poppler
Once the above commands have been executed, check the version information by typing the following command in the terminal:
ogrinfo --version
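The same check can be scripted from Python. The sketch below is a minimal example under stated assumptions: it assumes the GDAL command-line tools should be on PATH and that the Python bindings are importable as `osgeo` (the module name used by the GDAL package on PyPI). The helper name `check_gdal` is hypothetical and not part of geoAnalytics.

```python
import importlib.util
import shutil


def check_gdal(cli_tools=("ogrinfo", "gdal_translate"), module="osgeo"):
    """Report which GDAL pieces are visible from this Python environment."""
    # shutil.which returns None when the executable is not on PATH.
    report = {tool: shutil.which(tool) is not None for tool in cli_tools}
    # find_spec returns None when the module cannot be located.
    report[module] = importlib.util.find_spec(module) is not None
    return report


for name, found in check_gdal().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

If `osgeo` is reported missing while the command-line tools are found, the system GDAL is installed but the Python bindings are not visible in the active environment.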
- Install and set up Anaconda. URL: https://linuxize.com/post/how-to-install-anaconda-on-centos-7
- Create a virtual environment using conda. E.g., conda create --name geoAnalytics
- Enter the virtual environment. E.g., conda activate geoAnalytics
- Install Python. E.g., conda install python
- Install PyCharm from the website
- Open PyCharm and, using VCS, download the latest copy of geoAnalytics from GitHub
- In PyCharm, set the geoAnalytics environment as the project interpreter
- Open the terminal in PyCharm and execute the following command:
pip install mplcursors matplotlib scikit-learn pandas
Installation
pip install geoAnalytics
Upgrade
pip install --upgrade geoAnalytics
Uninstallation
pip uninstall geoAnalytics
Information
pip show geoAnalytics
$ python
$ !apt update
$ !apt install -y nco cdo gdal-bin
$ !which ncrename
Output: /usr/bin/ncrename
$ !which cdo
Output: /usr/bin/cdo
$ !which gdal_translate
Output: /usr/bin/gdal_translate
!pip install -U geoanalytics
pip show geoanalytics
Output:
Name: geoanalytics
Version: 2025.6.10.3
Summary: This software is being developed at the University of Aizu, Aizu-Wakamatsu, Fukushima, Japan
Home-page: https://github.com/UdayLab/geoanalytics
Author:
Author-email: Rage Uday Kiran <[email protected]>
License: GPLv3
Location: /usr/local/lib/python3.11/dist-packages
Requires: deprecated, discord.py, fastparquet, matplotlib, mplcursors, networkx, numba, numpy, pandas, Pillow, plotly, psutil, psycopg2-binary, resource, scikit-learn, shapely, sphinx, sphinx-rtd-theme, tqdm, urllib3, validators
Required-by:
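The installed version can also be read from inside Python, which is convenient in notebooks and test scripts. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("geoanalytics")
print("geoanalytics:", version if version else "not installed")
```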
Example data directory: https://data.darts.isas.jaxa.jp/pub/pds3/sln-l-mi-5-map-v3.0/lon042/data/
from geoanalytics.conversion import Raster2CSV
Step 2: Pass the lbl file as inputFile, give the desired outputFile name, and specify the startBand and endBand values.
converter = Raster2CSV.Raster2CSV(inputFile='MI_MAP_03_S16E035S17E036SC.lbl', outputFile='Moon.csv', startBand=1, endBand=9)
converter.run()
Output:
Processing: MI_MAP_03_S16E035S17E036SC.lbl
Done. Output saved to: Moon.csv
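To make the shape of the output file concrete, the sketch below shows one way a multi-band grid can be flattened into tab-separated rows of x, y, and band values. It is a conceptual illustration only, not the implementation of Raster2CSV: the function names are hypothetical, the demo values are made up, and it assumes a north-up grid whose coordinates refer to pixel centers.

```python
import csv
import io


def raster_to_rows(bands, x_origin, y_origin, pixel_width, pixel_height):
    """Flatten a multi-band grid into [x, y, band1, ..., bandN] rows.

    bands: list of 2-D lists, one per band, all with the same shape.
    (x_origin, y_origin): top-left corner of the grid (assumed convention).
    """
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    rows = []
    for r in range(n_rows):
        # y decreases going down the image in a north-up raster.
        y = y_origin - (r + 0.5) * pixel_height
        for c in range(n_cols):
            x = x_origin + (c + 0.5) * pixel_width
            rows.append([x, y] + [band[r][c] for band in bands])
    return rows


def rows_to_tsv(rows, n_bands):
    """Serialise rows as tab-separated text with an x, y, 1..N header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["x", "y"] + [str(i) for i in range(1, n_bands + 1)])
    writer.writerows(rows)
    return buffer.getvalue()


# Tiny 2x2 grid with two bands (made-up values for illustration only).
demo_bands = [[[10, 11], [12, 13]], [[20, 21], [22, 23]]]
demo_rows = raster_to_rows(demo_bands, x_origin=0.0, y_origin=4.0,
                           pixel_width=2.0, pixel_height=2.0)
print(rows_to_tsv(demo_rows, n_bands=2))
```

Each pixel becomes one row, so a grid with R rows and C columns produces R × C lines after the header.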
import pandas as pd
df = pd.read_csv('Moon.csv', sep='\t')
df
x | y | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
---|---|---|---|---|---|---|---|---|---|---|
1061317.265 | -485173.607 | 1928 | 3446 | 3859 | 3913 | 4026 | 3999 | 4236 | 5245 | 6513 |
1061332.071 | -485173.607 | 1924 | 3480 | 3876 | 3930 | 4059 | 3996 | 4243 | 5234 | 6518 |
1061346.877 | -485173.607 | 1904 | 3476 | 3834 | 3923 | 4047 | 3992 | 4238 | 5222 | 6523 |
1061361.684 | -485173.607 | 1874 | 3452 | 3801 | 3897 | 3959 | 3988 | 4228 | 5210 | 6518 |
1061376.490 | -485173.607 | 1907 | 3464 | 3777 | 3868 | 3974 | 3983 | 4218 | 5198 | 6504 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1091566.583 | -515482.151 | 2090 | 3719 | 4007 | 3996 | 4117 | 4056 | 4310 | 5374 | 6633 |
1091581.390 | -515482.151 | 2098 | 3734 | 4038 | 4020 | 4177 | 4083 | 4327 | 5389 | 6659 |
1091596.196 | -515482.151 | 2114 | 3767 | 4046 | 4040 | 4213 | 4110 | 4332 | 5393 | 6685 |
1091611.002 | -515482.151 | 2123 | 3813 | 4019 | 4056 | 4214 | 4136 | 4331 | 5397 | 6711 |
1091625.809 | -515482.151 | 2125 | 3784 | 4011 | 4051 | 4184 | 4136 | 4329 | 5402 | 6737 |
4194304 rows × 11 columns |
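The row count of 4,194,304 equals 2048 × 2048, which is consistent with a square grid of 2048 pixels per side. A quick way to confirm the grid dimensions of such a table is to count the distinct x and y coordinates. The sketch below does this in plain Python on a small made-up point set; the helper name `grid_shape` is hypothetical.

```python
def grid_shape(points):
    """Return (n_rows, n_cols) for points lying on a complete regular grid."""
    xs = {x for x, _ in points}
    ys = {y for _, y in points}
    # On a complete grid every (x, y) combination occurs exactly once.
    if len(xs) * len(ys) != len(points):
        raise ValueError("points do not form a complete rectangular grid")
    return len(ys), len(xs)


# Made-up 2-row by 3-column grid for illustration.
demo_points = [(x, y) for y in (3.0, 1.0) for x in (1.0, 3.0, 5.0)]
print(grid_shape(demo_points))  # (2, 3)
```

With the pandas frame loaded above, the equivalent check would be `df['x'].nunique()` and `df['y'].nunique()`.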
!pip install fuzzy-c-means
from geoanalytics.clustering import FuzzyCMeans
obj = FuzzyCMeans.FuzzyCMeans(dataframe=df)
labels, centers = obj.run(n_clusters=4)
labels
x | y | labels |
---|---|---|
1061317.265 | -485173.607 | 2 |
1061332.071 | -485173.607 | 2 |
1061346.877 | -485173.607 | 2 |
1061361.684 | -485173.607 | 2 |
1061376.490 | -485173.607 | 2 |
... | ... | ... |
1091566.583 | -515482.151 | 2 |
1091581.390 | -515482.151 | 0 |
1091596.196 | -515482.151 | 0 |
1091611.002 | -515482.151 | 0 |
1091625.809 | -515482.151 | 0 |
4194304 rows × 3 columns |
centers
Output:
array([[2098.51620472, 3762.45416178, 4099.00372289, 4146.23025903,
4295.80971825, 4229.78337628, 4443.6879668 , 5471.85298977,
6790.96720993],
[2274.54605523, 4040.75198398, 4374.89922669, 4395.57058102,
4538.50885801, 4454.45212911, 4677.95182781, 5739.96854105,
7095.06428559],
[1984.82196498, 3578.0340764 , 3908.79548852, 3961.66894715,
4108.90409919, 4049.30456217, 4255.81099407, 5258.24055464,
6549.13133772],
[1896.71176664, 3421.42972113, 3741.56611478, 3802.88625245,
3946.94784255, 3896.3217178 , 4093.18671711, 5066.07543066,
6327.35005223]])
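To relate the centers to the labels, the sketch below computes the standard fuzzy c-means membership of a single pixel in each cluster and takes the cluster with the highest membership as the hard label. This is a textbook formulation, not necessarily what the library does internally: the fuzzifier m = 2 is an assumption, the centers are rounded copies of the array above, and it assumes that label values index the rows of the centers array.

```python
import math

# Cluster centers copied from the output above, rounded to one decimal.
CENTERS = [
    [2098.5, 3762.5, 4099.0, 4146.2, 4295.8, 4229.8, 4443.7, 5471.9, 6791.0],
    [2274.5, 4040.8, 4374.9, 4395.6, 4538.5, 4454.5, 4678.0, 5740.0, 7095.1],
    [1984.8, 3578.0, 3908.8, 3961.7, 4108.9, 4049.3, 4255.8, 5258.2, 6549.1],
    [1896.7, 3421.4, 3741.6, 3802.9, 3946.9, 3896.3, 4093.2, 5066.1, 6327.4],
]


def memberships(pixel, centers, m=2.0):
    """Standard fuzzy c-means membership of one sample in each cluster."""
    dists = [math.dist(pixel, c) for c in centers]
    # A sample sitting exactly on a center belongs fully to that cluster.
    if 0.0 in dists:
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]


# Band values 1..9 of the first row of the converted table.
first_pixel = [1928, 3446, 3859, 3913, 4026, 3999, 4236, 5245, 6513]
u = memberships(first_pixel, CENTERS)
hard_label = max(range(len(u)), key=u.__getitem__)
print([round(v, 3) for v in u], hard_label)
```

For this pixel the highest membership falls on the third center (index 2), which agrees with the label reported for that row in the labels table.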
obj.getRuntime()
obj.getMemoryRSS()
obj.getMemoryUSS()
Output:
Total Execution time of proposed Algorithm: 407.7022657394409 seconds
Memory (RSS) of proposed Algorithm in KB: 1655512.0
Memory (USS) of proposed Algorithm in KB: 1634900.0
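Comparable figures can be collected for any function with the standard library. The sketch below is only an illustration of the idea and is not how geoAnalytics computes its numbers: the helper name `measure` is hypothetical, it reports peak RSS rather than current RSS, and it does not report USS, which typically needs a third-party package such as psutil.

```python
import resource
import time


def measure(func, *args, **kwargs):
    """Run func and return (result, elapsed_seconds, peak_rss)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    # ru_maxrss is the peak resident set size: kilobytes on Linux,
    # bytes on macOS. The resource module is not available on Windows.
    peak_rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, elapsed, peak_rss


result, seconds, peak_kb = measure(sum, range(1_000_000))
print(f"result={result} time={seconds:.4f}s peak RSS={peak_kb}")
```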
obj.save(outputFileLabels='FuzzyCMeansLabels.csv', outputFileCenters='FuzzyCMeansCenters.csv')
Output:
Labels saved to: FuzzyCMeansLabels.csv
Cluster centers saved to: FuzzyCMeansCenters.csv
from geoanalytics.conversion import CSV2Raster as CSV2Raster
process = CSV2Raster.CSV2Raster(dataframe=labels,outputFile='FuzzyCMeans.tiff')
process.run()
Output:
(0, '')
(139, 'Segmentation fault (core dumped)')
(0, '')
(0, 'Input file size is 2048, 2048\n0...10...20...30...40...50...60...70...80...90...100 - done.')
(0, '')
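The printed tuples look like (exit status, message) pairs, the same shape that `subprocess.getstatusoutput` returns. Under that assumption, a status of 139 corresponds to a process terminated by a segmentation fault (128 + signal 11), so it is worth checking the results rather than relying on the final step alone. A minimal sketch with a hypothetical helper name:

```python
def failed_steps(results):
    """Return (index, status, message) for every step with a non-zero status."""
    return [(i, status, msg) for i, (status, msg) in enumerate(results) if status != 0]


# The tuples printed by the run shown above.
run_output = [
    (0, ""),
    (139, "Segmentation fault (core dumped)"),
    (0, ""),
    (0, "Input file size is 2048, 2048\n0...10...20...30...40...50...60...70...80...90...100 - done."),
    (0, ""),
]

for index, status, message in failed_steps(run_output):
    print(f"step {index} exited with status {status}: {message}")
```

Here the second step failed while the later steps succeeded, so the resulting TIFF should be inspected before it is used further.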
pip install rasterio
from geoanalytics.visualization import TiffViewer
viewer = TiffViewer.TiffViewer(inputFile='FuzzyCMeans.tiff')
viewer.run(cmap='gray', title='TIFF Image')
viewer.run(cmap='jet', title='TIFF Image')
The official documentation is hosted on geoAnalytics.
For any queries, the best place to go is GitHub Issues.
In our GitHub repository, the primary platform for discussing development-related matters is the university lab. We encourage our team members and contributors to utilize this platform for a wide range of discussions, including bug reports, feature requests, design decisions, and implementation details.
We invite and encourage all community members to contribute, report bugs, fix bugs, enhance documentation, propose improvements, and share their creative ideas.
Conversion |
---|
CSV To Raster |
Raster To CSV |
Imputation |
---|
Backward Fill |
Forward Fill |
Interpolation |
Mean Imputation |
Median Imputation |
Mode Imputation |
Number Imputation |
Soft Imputation |
Clustering |
---|
KMeans |
KMeansPP |
MEANshift |
AffinityPropagation |
Agglomerative |
DBScan |
FuzzyCMeans |
Gaussianmixture |
HDBScan |
OpticsClustering |
Spectral |
Pattern Mining |
---|
Frequent Pattern Mining |