This repository contains an exploratory data analysis (EDA) of a real estate dataset. The analysis includes data cleaning, visualization, and insights into various aspects of the real estate market across different cities.
The dataset used in this analysis contains information about real estate properties from various cities. The key columns in the dataset are:
- city: The city where the property is located
- bedroom: The number of bedrooms in the property
- facing: The direction the property is facing
- total_area: The total area of the property
- Price: The price of the property
The analysis is divided into several sections:
- Checking for null values and ensuring the data is clean.
- Displaying basic statistics of the dataset.
- A histogram showing the number of houses available for sale in different cities.
- Insight: Delhi has more houses available for sale than any other city in the dataset.
- A pie chart showing the distribution of houses based on the direction they are facing.
- Insight: The direction of houses is an important factor in the Indian real estate market.
- A histogram showing the distribution of houses based on the number of bedrooms.
- Insight: Houses with 3 bedrooms are the most common, followed by houses with 2 and 4 bedrooms.
- A scatter plot comparing the total area and price of houses in different cities.
- Insight: The plot gives a visual comparison of house prices relative to area across the cities.
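The steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: the sample rows are made up so the snippet runs on its own, the variable names are assumptions, and plain pandas/matplotlib plotting is used where the notebook may use seaborn. The column names follow the list given earlier.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Tiny made-up sample with the documented columns (NOT the real dataset).
df = pd.DataFrame({
    "city": ["Delhi", "Delhi", "Mumbai", "Pune", "Delhi", "Mumbai"],
    "bedroom": [3, 2, 3, 4, 3, 2],
    "facing": ["East", "North", "East", "West", "East", "North"],
    "total_area": [1200, 850, 1100, 1800, 1350, 700],
    "Price": [9500000, 6200000, 14000000, 11000000, 10500000, 8800000],
})

# Data cleaning: count missing values per column.
null_counts = df.isnull().sum()

# Basic statistics for the numeric columns.
summary = df.describe()

# Counts behind the three distribution charts.
city_counts = df["city"].value_counts()
facing_counts = df["facing"].value_counts()
bedroom_counts = df["bedroom"].value_counts().sort_index()

fig, axes = plt.subplots(2, 2, figsize=(12, 9))

# Number of houses for sale per city.
city_counts.plot.bar(ax=axes[0, 0], title="Houses for sale by city")

# Share of houses by facing direction.
facing_counts.plot.pie(ax=axes[0, 1], autopct="%1.0f%%",
                       title="Houses by facing direction")
axes[0, 1].set_ylabel("")

# Number of houses per bedroom count.
bedroom_counts.plot.bar(ax=axes[1, 0], title="Houses by number of bedrooms")

# Total area against price, one colour per city.
for city, group in df.groupby("city"):
    axes[1, 1].scatter(group["total_area"], group["Price"], label=city)
axes[1, 1].set_xlabel("total_area")
axes[1, 1].set_ylabel("Price")
axes[1, 1].set_title("Total area vs. price by city")
axes[1, 1].legend()

fig.tight_layout()
```

In a Jupyter Notebook the figure is displayed when the cell finishes. Because the sample rows are invented, the resulting charts only show the shape of the analysis and do not reproduce the insights reported above.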
The analysis includes various visualizations to aid in understanding the dataset:
- Histograms
- Pie charts
- Scatter plots
To run the analysis, you need to have Python installed along with the following libraries:
- pandas
- numpy
- matplotlib
- seaborn
You can install these libraries using pip:
pip install pandas numpy matplotlib seaborn
Load the dataset and run the analysis by executing the cells in the Jupyter Notebook.
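If you want to load the data outside the notebook, something like the following may help. It is a sketch under assumptions: the dataset is taken to be a CSV file, `real_estate.csv` is a hypothetical file name, and `load_dataset` is a helper introduced here for illustration, not a function from this repository.

```python
from pathlib import Path

import pandas as pd

# Columns described in the dataset section of this README.
EXPECTED_COLUMNS = ["city", "bedroom", "facing", "total_area", "Price"]


def load_dataset(path):
    """Read the CSV and fail early if an expected column is missing."""
    df = pd.read_csv(path)
    missing = [col for col in EXPECTED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df


# Hypothetical file name; replace it with the actual dataset path.
csv_path = Path("real_estate.csv")
if csv_path.exists():
    df = load_dataset(csv_path)
    print(df.head())
```

Checking the column names up front makes a wrong or renamed file fail with a clear message instead of a `KeyError` later in the analysis.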
This exploratory data analysis provides insights into the real estate market across different cities. It highlights key factors such as the supply of houses, distribution by direction, and the relationship between area and price.
This project is licensed under the MIT License.
- The dataset was provided for educational purposes.
- The analysis was conducted using Python and various data visualization libraries.
For any queries or contributions, feel free to open an issue or submit a pull request.