Commit d75443e

Added all the necessary files
1 parent 19506a8 commit d75443e

20 files changed, 272 additions and 2 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+.vscode/settings.json
539 KB

Images/optimal_skills.png

484 KB

Images/salary_trend_analysis.png

536 KB

Images/skill_tracking_premium.png

586 KB

Images/top_demanded_skills.png

469 KB

Images/top_paying_job_skills.png

494 KB

Images/top_paying_jobs.png

544 KB

Images/top_paying_skills.png

495 KB

Insights.xlsx

113 KB
Binary file not shown.

Project/1_top_paying_jobs.sql

Lines changed: 2 additions & 0 deletions
@@ -6,6 +6,7 @@ Question: What are the top-paying data analyst jobs?
 - Why? Highlight the top-paying opportunities for Data Analysts, offering insights into employment options and location flexibility.
 */
 
+-- Top 10 highest paying Data Analyst roles (Remote & India)
 SELECT
 jpf.job_id,
 jpf.job_title,
@@ -122,6 +123,7 @@ Here's the result in JSON
 ]
 */
 
+-- Top 10 highest paying Data Analyst roles (India)
 SELECT
 jpf.job_id,
 jpf.job_title,

README.md

Lines changed: 121 additions & 2 deletions
@@ -1,2 +1,121 @@
-# SQL_Projects
-This is some text.
+# SQL Projects
+
+# Table of Contents
+- [Dataset Overview](#dataset-overview)
+
+- [Repository Structure](#repository-structure)
+- [Setup Instructions](#setup-instructions)
+
+- [Prerequisites](#prerequisites)
+- [Installation Steps](#installation-steps)
+- [Troubleshooting](#troubleshooting)
+
+- [Skills Required](#skills-required)
+- [Purpose](#purpose)
+- [Insights](#insights)
+
+
+# Dataset Overview
+This project uses a comprehensive dataset of global data job postings. To ensure complete reproducibility, all necessary files have been included in the repository.
+## Repository Structure
+The repository contains two main folders:
+- **csv_files**: Contains the raw dataset files
+- **sql_load**: Contains SQL scripts for table creation and data loading
+## Setup Instructions
+### Prerequisites
+- **PostgreSQL** (recommended for full compatibility)
+- **Excel** (for visualizations)
+- **Note**: While other RDBMS might work due to similar SQL syntax, PostgreSQL is preferred for guaranteed functionality
+### Installation Steps
+- Download the ZIP file from GitHub
+- Extract the contents to your desired location
+- Execute the SQL scripts located in the sql_load folder
+### Troubleshooting
+If you encounter permission-related issues during execution, refer to the detailed troubleshooting steps provided as comments within the SQL files.
+
+</br>
+
+# Skills Required
+To effectively understand and work with the queries in this project, you should be familiar with the following SQL concepts:
+- Basic SQL Statements
+- SQL Joins
+- Unions
+- Subqueries
+- Common Table Expressions (CTEs)
+- Window Functions
+- And more
+
+</br>
+
+# Purpose
+This project provides a collection of SQL queries designed to offer unique insights for **job seekers in the field of data analytics**. The queries enable users to explore the job market by analyzing key aspects such as top-paying jobs, high-demand skills, top-paying technical skills, salary trends, and more. Additionally, visualizations have been included to enhance understanding and make the data more accessible.
+
+</br>
+
+# Insights
+This section highlights key insights derived from the SQL queries, each supported by a visualization for better understanding. While not every possible permutation and combination is covered here, I highly encourage exploring the [SQL files](/Project/) directly for a more comprehensive view.
+[The Excel worksheet used for visualizations has been included for your reference.]
+
+</br>
+
+#### What are the top-paying data analyst jobs?
+<div align="center">
+<img src="/Images/top_paying_jobs.png" width="70%">
+<p>The role of a Data Analyst not only offers competitive compensation but also remains consistently in demand.</p>
+</div>
+
+</br>
+
+#### What skills are required for the top-paying data analyst jobs?
+<div align="center">
+<img src="/Images/top_paying_job_skills.png" width="70%">
+<p>While specialized skills often attract higher salaries, they come with trade-offs, such as limited job availability.</p>
+</div>
+
+</br>
+
+#### What are the most in-demand skills for data analysts?
+<div align="center">
+<img src="/Images/top_demanded_skills.png" width="70%">
+<p>Tools and languages like SQL, Excel, Python, Tableau, and Power BI continue to lead as the most in-demand skills in the industry.</p>
+</div>
+
+</br>
+
+#### What are the top skills based on salary?
+<div align="center">
+<img src="/Images/top_paying_skills.png" width="70%">
+<p>Similar to previous observations, niche technical skills consistently result in higher pay scales for professionals.</p>
+</div>
+
+</br>
+
+#### What are the optimal skills to learn (i.e., skills that are both in high demand and high-paying)?
+<div align="center">
+<img src="/Images/optimal_skills.png" width="70%">
+<p>Despite the complexity of the data, the conclusion remains steady: niche skills command a premium, while universally required skills like SQL and Excel offer solid but comparatively lower pay.</p>
+</div>
+
+</br>
+
+#### What is the month-over-month growth rate of average salaries for data analyst positions?
+<div align="center">
+<img src="/Images/salary_trend_analysis.png" width="70%">
+<p>Average salaries peak mid-year (June through August) before declining toward the end of the year.</p>
+</div>
+
+</br>
+
+#### How does each company's data analyst salary compare to the industry average, and which companies consistently pay above the market rate?
+<div align="center">
+<img src="/Images/company_compensation_comparison.png" width="70%">
+<p>Leading tech companies like Mantys, OpenAI, and Anthropic consistently offer salaries well above market standards, making them stand out as top-paying employers.</p>
+</div>
+
+</br>
+
+#### What is the salary premium for each skill compared to the running average of all data analyst positions?
+<div align="center">
+<img src="/Images/skill_tracking_premium.png" width="70%">
+<p>Across all visualizations, the trend is clear: niche skills lead to significantly higher salaries when compared to broader, in-demand technical skills.</p>
+</div>
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+"name","avg_company_salary","overall_avg_salary","percentage_difference"
+"Mantys","650000","91931","607.05%"
+"ЛАНИТ","400000","91931","335.11%"
+"Torc Robotics","375000","91931","307.91%"
+"Illuminate Mission Solutions","375000","91931","307.91%"
+"Care.com","350000","91931","280.72%"
+"Anthropic","295000","91931","220.89%"
+"OpenAI","285000","91931","210.02%"
+"Google","254000","91931","176.29%"
+"Asana","235000","91931","155.63%"
+"Pinterest Job Advertisements","232423","91931","152.82%"
+"Genentech","230000","91931","150.19%"
+"CliftonLarsonAllen","225000","91931","144.75%"
+"GovCIO","225000","91931","144.75%"
+"MosaicML","220000","91931","139.31%"
+"Ball","218500","91931","137.68%"
+"Uclahealthcareers","217000","91931","136.05%"
+"F. Hoffmann-La Roche AG","215643","91931","134.57%"
+"Channel Personnel Services","210000","91931","128.43%"
+"Walmart Global Tech","203833","91931","121.72%"
+"ThinkingData","200000","91931","117.55%"
+"Airbnb","200000","91931","117.55%"
+"Snakorpio Group Inc.","200000","91931","117.55%"
+"Upstart","200000","91931","117.55%"
+"Withings","200000","91931","117.55%"
+"WINGS-ICT-SOLUTIONS","200000","91931","117.55%"

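The `percentage_difference` column in the company comparison above appears to be the relative gap between each company's average salary and the overall average. The SQL that produced it is not shown in this diff, so the formula below is an assumption inferred from the column names; the function name `percentage_difference` and the three sample rows are picked only for illustration. A minimal Python sketch that recomputes the column for those rows:

```python
def percentage_difference(value, baseline):
    """Relative gap between value and baseline, formatted like the CSV column.

    Assumed formula: (value - baseline) / baseline * 100, to two decimals.
    """
    return f"{(value - baseline) / baseline * 100:.2f}%"


# overall_avg_salary as it appears in every row of the CSV above
OVERALL_AVG_SALARY = 91931

# Three (name, avg_company_salary) pairs copied from the CSV above
sample = {"Mantys": 650000, "Anthropic": 295000, "Google": 254000}

recomputed = {
    name: percentage_difference(salary, OVERALL_AVG_SALARY)
    for name, salary in sample.items()
}
print(recomputed)
```

For these three rows the recomputed values agree with the committed ones, which suggests (but does not prove) that the query uses the same formula.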
results_csv/optimal_skills.csv

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+"skill_id","skill_name","skill_type","skill_count","avg_salary"
+0,"sql","programming","92628","96435"
+181,"excel","analyst_tools","67031","86419"
+1,"python","programming","57326","101512"
+182,"tableau","analyst_tools","46554","97978"
+183,"power bi","analyst_tools","39468","92324"
+5,"r","programming","30075","98708"
+186,"sas","analyst_tools","14034","93707"
+7,"sas","programming","14034","93707"
+196,"powerpoint","analyst_tools","13848","88316"
+188,"word","analyst_tools","13591","82941"
+189,"sap","analyst_tools","11297","92446"
+74,"azure","cloud","10942","105400"
+79,"oracle","cloud","10410","100964"
+76,"aws","cloud","9063","106440"
+61,"sql server","databases","8304","96191"
+8,"go","programming","7928","97267"
+215,"flow","other","7289","98020"
+22,"vba","programming","6870","93845"
+185,"looker","analyst_tools","6271","103855"
+80,"snowflake","cloud","6194","111578"
+187,"qlik","analyst_tools","5693","100933"
+4,"java","programming","5251","100214"
+92,"spark","libraries","5041","113002"
+233,"jira","async","4753","107931"
+199,"spss","analyst_tools","4711","85293"

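The README describes an optimal skill as one that is both in high demand and high-paying, but it gives no cutoffs. The sketch below shows one way to filter `optimal_skills.csv` on both columns at once. The thresholds (`min_count=10_000`, `min_salary=100_000`) and the function name `high_demand_high_pay` are hypothetical choices made for illustration, and only six rows of the file are embedded.

```python
import csv
import io

# Six rows copied from results_csv/optimal_skills.csv (not the full file)
SAMPLE = '''"skill_id","skill_name","skill_type","skill_count","avg_salary"
0,"sql","programming","92628","96435"
181,"excel","analyst_tools","67031","86419"
1,"python","programming","57326","101512"
74,"azure","cloud","10942","105400"
80,"snowflake","cloud","6194","111578"
92,"spark","libraries","5041","113002"
'''


def high_demand_high_pay(csv_text, min_count, min_salary):
    """Return (skill, count, salary) tuples that clear both thresholds,
    sorted by average salary, highest first."""
    picked = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        count = int(row["skill_count"])
        salary = int(row["avg_salary"])
        if count >= min_count and salary >= min_salary:
            picked.append((row["skill_name"], count, salary))
    return sorted(picked, key=lambda item: item[2], reverse=True)


# Thresholds are illustrative assumptions, not values from the repository
result = high_demand_high_pay(SAMPLE, min_count=10_000, min_salary=100_000)
print(result)
```

With these cutoffs, SQL and Excel drop out on salary while Snowflake and Spark drop out on demand, which is the trade-off the README caption describes.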
results_csv/salary_trend_analysis.csv

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+"job_posted_month","avg_monthly_salary","prev_month_avg_salary","monthly_diff","monthly_diff_percentage"
+"1","92966","","",""
+"2","94979","92966","2013","2.17%"
+"3","93524","94979","-1455","-1.53%"
+"4","94807","93524","1283","1.37%"
+"5","94298","94807","-509","-0.54%"
+"6","97636","94298","3338","3.54%"
+"7","98138","97636","502","0.51%"
+"8","98105","98138","-33","-0.03%"
+"9","92074","98105","-6031","-6.15%"
+"10","90216","92074","-1858","-2.02%"
+"11","86854","90216","-3362","-3.73%"
+"12","86639","86854","-215","-0.25%"
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+"skill_name","avg_skill_salary","overall_avg_salary","percentage_difference"
+"svn","400000","102020","292.08%"
+"solidity","179000","102020","75.46%"
+"couchbase","160515","102020","57.34%"
+"datarobot","155486","102020","52.41%"
+"golang","155000","102020","51.93%"
+"mxnet","149000","102020","46.05%"
+"dplyr","147633","102020","44.71%"
+"vmware","147500","102020","44.58%"
+"terraform","146734","102020","43.83%"
+"twilio","138500","102020","35.76%"
+"gitlab","134126","102020","31.47%"
+"kafka","129999","102020","27.43%"
+"puppet","129820","102020","27.25%"
+"keras","127013","102020","24.50%"
+"pytorch","125226","102020","22.75%"
+"perl","124686","102020","22.22%"
+"ansible","124370","102020","21.91%"
+"hugging face","123950","102020","21.50%"
+"tensorflow","120647","102020","18.26%"
+"cassandra","118407","102020","16.06%"
+"notion","118092","102020","15.75%"
+"atlassian","117966","102020","15.63%"
+"bitbucket","116712","102020","14.40%"
+"airflow","116387","102020","14.08%"
+"scala","115480","102020","13.19%"

results_csv/top_demanded_skills.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"skill_id","skill_count","skills","type"
+0,"92628","sql","programming"
+181,"67031","excel","analyst_tools"
+1,"57326","python","programming"
+182,"46554","tableau","analyst_tools"
+183,"39468","power bi","analyst_tools"
+5,"30075","r","programming"
+186,"14034","sas","analyst_tools"
+7,"14034","sas","programming"
+196,"13848","powerpoint","analyst_tools"
+188,"13591","word","analyst_tools"

results_csv/top_paying_job_skills.csv

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+"job_id","job_title","company_name","job_schedule_type","job_location","avg_yearly_salary","skills"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","sql"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","python"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","r"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","azure"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","databricks"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","aws"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","pandas"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","pyspark"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","jupyter"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","excel"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","tableau"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","power bi"
+552322,"Associate Director- Data Insights","AT&T","Full-time","Anywhere","255829.5","powerpoint"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","sql"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","python"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","r"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","hadoop"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","tableau"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","sql"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","crystal"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","oracle"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","tableau"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","flow"

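`top_paying_job_skills.csv` holds one row per job and skill pair, so each job's title, company, and salary repeat once for every skill. When reading the file it can help to collapse those rows into one skill list per job. The sketch below does that in Python for ten of the committed rows; the function name `skills_by_job` is hypothetical, and this is a reading aid rather than part of the repository's queries.

```python
import csv
import io
from collections import defaultdict

# Ten rows copied from results_csv/top_paying_job_skills.csv (not the full file)
SAMPLE = '''"job_id","job_title","company_name","job_schedule_type","job_location","avg_yearly_salary","skills"
99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","sql"
99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","python"
99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","r"
99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","hadoop"
99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","Anywhere","232423.0","tableau"
1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","sql"
1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","crystal"
1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","oracle"
1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","tableau"
1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","Anywhere","217000.0","flow"
'''


def skills_by_job(csv_text):
    """Group the skills column by (job_id, job_title), keeping row order."""
    grouped = defaultdict(list)
    # csv.DictReader handles the quoted comma in "Data Analyst, Marketing"
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["job_id"]), row["job_title"])
        grouped[key].append(row["skills"])
    return dict(grouped)


jobs = skills_by_job(SAMPLE)
for (job_id, title), skills in jobs.items():
    print(job_id, title, "->", ", ".join(skills))
```

In PostgreSQL the same collapse could likely be expressed with `STRING_AGG` and `GROUP BY`, though the repository's query is not shown here.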
results_csv/top_paying_jobs.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"job_id","job_title","company_name","job_schedule_type","avg_yearly_salary","job_posted_date"
+226942,"Data Analyst","Mantys","Full-time","650000.0","2023-02-20 15:13:33"
+547382,"Director of Analytics","Meta","Full-time","336500.0","2023-08-23 12:04:42"
+552322,"Associate Director- Data Insights","AT&T","Full-time","255829.5","2023-06-18 16:03:12"
+99305,"Data Analyst, Marketing","Pinterest Job Advertisements","Full-time","232423.0","2023-12-05 20:00:40"
+1021647,"Data Analyst (Hybrid/Remote)","Uclahealthcareers","Full-time","217000.0","2023-01-17 00:17:23"
+168310,"Principal Data Analyst (Remote)","SmartAsset","Full-time","205000.0","2023-08-09 11:00:01"
+731368,"Director, Data Analyst - HYBRID","Inclusively","Full-time","189309.0","2023-12-07 15:00:13"
+310660,"Principal Data Analyst, AV Performance Analysis","Motional","Full-time","189000.0","2023-01-05 00:00:25"
+1749593,"Principal Data Analyst","SmartAsset","Full-time","186000.0","2023-07-11 16:00:05"
+387860,"ERM Data Analyst","Get It Recruit - Information Technology","Full-time","184000.0","2023-06-09 08:01:04"

results_csv/top_paying_skills.csv

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+"skill_id","skill_name","skill_type","avg_salary"
+224,"svn","other","400000"
+38,"solidity","programming","179000"
+65,"couchbase","databases","160515"
+206,"datarobot","analyst_tools","155486"
+27,"golang","programming","155000"
+109,"mxnet","libraries","149000"
+119,"dplyr","libraries","147633"
+73,"vmware","cloud","147500"
+212,"terraform","other","146734"
+250,"twilio","sync","138500"
