Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* General setup for Spaghetti Query detector
* Setup Halstead Metric and function for complexity
* Add function to get all columns from ORDER BY
* Add tests for the get_all_columns method
* Linting
* Add description
* Add sql metadata package
* Add method to format query
* Add function to determine whether query has subqueries
* Add distinction between subquery and UNION
* Add function that checks if the query has a UNION
* Change has_subqueries to look at (SELECT instead of just SELECT
* Add method that gets all queries in a union sql query
* Add function to get SQL operators
* Add type hints to halstead metric function
* Implement halstead complexity
* Add spaghetti query detector to detector; fix error when query is not defined
* Add spaghetti query anti-pattern detector
* Relax thresholds for spaghetti query
* Relax thresholds for spaghetti query again
* Change printer to only output location when provided
* Add check to not print descriptions if there are no descriptions
* Update test cases
* Linting and MyPy
1 parent 1ea7233, commit bb4f3a5
Showing 11 changed files with 597 additions and 6 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,3 @@ | ||
sqlparse==0.4.2 | ||
sql-metadata==2.4.0 | ||
rich==12.0.0 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
### Spaghetti Query | ||
|
||
Queries vary in difficulty. Some are significantly more sophisticated than others, such as a complex join between two databases or a recursive subquery. Sometimes, while developing a query for a complex task, the query becomes so complex that the programmer gets stuck. This is most likely because programmers are fixated on solving the task both elegantly and efficiently, so they try to complete it with a single query. However, the complexity of such single queries can increase exponentially, making both maintainability and correctness more difficult to achieve. | ||
|
||
#### Example code | ||
|
||
```SQL | ||
SELECT COUNT(p.pID) AS numberOfDistinctProducts, | ||
SUM(i.quantity) AS numberOfProducts, | ||
AVG(i.unit_price) AS averagePrice, | ||
city | ||
FROM products p | ||
JOIN inventories i ON (p.pID = i.pID) | ||
JOIN stores s ON (i.sID = s.sID) | ||
GROUP BY s.city | ||
``` | ||
|
||
The query above can be considered overly complex for what it does, but it demonstrates the type of problem that can occur when a programmer tries to solve a complicated problem in one query. SQL is a sophisticated language that allows you to do a great deal with a single query or statement. However, this does not mean that every problem has to be solved with a single query or line of code. | ||
|
||
#### Fix | ||
|
||
Sometimes it is better to write separate queries for a task than to try to accomplish it with one query. A simple way to tackle complex queries is the **divide and conquer** method, where you divide the problem into multiple parts so you can solve them independently. In other words, if you break up a long, complex query into several simpler queries, you can focus on each part individually and do a better job on each of them, since they are less complex. While it is not always possible to split a query this way, it is a good general strategy, which is often all that is necessary. |
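
The divide-and-conquer fix described above can be sketched in Python with the standard-library `sqlite3` module. Only the table and column names come from the example query; the schema details, sample rows, and variable names are assumptions made up for illustration. Each question the single query tried to answer gets its own, simpler query.

```python
import sqlite3

# Hypothetical in-memory schema and sample rows, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (pID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE stores (sID INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE inventories (pID INTEGER, sID INTEGER,
                              quantity INTEGER, unit_price REAL);
    INSERT INTO products VALUES (1, 'apple'), (2, 'pear');
    INSERT INTO stores VALUES (1, 'Amsterdam'), (2, 'Amsterdam'), (3, 'Utrecht');
    INSERT INTO inventories VALUES
        (1, 1, 10, 1.0), (1, 2, 5, 1.5), (2, 2, 4, 2.0), (2, 3, 7, 2.5);
""")

# Question 1: how many distinct products does each city stock?
distinct_products = dict(conn.execute("""
    SELECT s.city, COUNT(DISTINCT i.pID)
    FROM inventories i
    JOIN stores s ON i.sID = s.sID
    GROUP BY s.city
"""))

# Question 2: what are the total quantity and average unit price per city?
stock_stats = {
    city: (total_quantity, average_price)
    for city, total_quantity, average_price in conn.execute("""
        SELECT s.city, SUM(i.quantity), AVG(i.unit_price)
        FROM inventories i
        JOIN stores s ON i.sID = s.sID
        GROUP BY s.city
    """)
}

print(distinct_products)
print(stock_stats)
```

Because each query answers one question and joins only the tables it needs, each result can be checked on its own. In the combined query, `COUNT(p.pID)` counts joined rows, so a product stocked by two stores in the same city is counted twice; the separate query makes the `DISTINCT` intent explicit.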
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
"""Spaghetti Query anti-pattern detector class""" | ||
from sqleyes.definitions.definitions import DEFINITIONS | ||
from sqleyes.detector.antipatterns.abstract_base_class import AbstractDetector | ||
from sqleyes.detector.detector_output import DetectorOutput | ||
from sqleyes.utils.query_functions import get_query_complexity | ||
|
||
|
||
class SpaghettiQueryDetector(AbstractDetector): | ||
|
||
filename = DEFINITIONS["anti_patterns"]["spaghetti_query"]["filename"] | ||
type = DEFINITIONS["anti_patterns"]["spaghetti_query"]["type"] | ||
title = DEFINITIONS["anti_patterns"]["spaghetti_query"]["title"] | ||
|
||
def __init__(self, query): | ||
super().__init__(query) | ||
|
||
def check(self): | ||
LOW_THRESHOLD = 60 | ||
MEDIUM_THRESHOLD = 75 | ||
HIGH_THRESHOLD = 90 | ||
|
||
query_complexity = get_query_complexity(self.query) | ||
|
||
if query_complexity < LOW_THRESHOLD: | ||
return None | ||
|
||
if LOW_THRESHOLD <= query_complexity < MEDIUM_THRESHOLD: | ||
certainty = "low" | ||
|
||
if MEDIUM_THRESHOLD <= query_complexity < HIGH_THRESHOLD: | ||
certainty = "medium" | ||
|
||
if HIGH_THRESHOLD <= query_complexity: | ||
certainty = "high" | ||
|
||
return DetectorOutput(query=self.query, | ||
certainty=certainty, | ||
description=super().get_description(), | ||
detector_type=self.detector_type, | ||
locations=[], | ||
title=self.title, | ||
type=self.type) |
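
The threshold bands used by `check()` can be read as a mapping from a complexity score to a certainty label. A minimal standalone sketch of that mapping follows. `certainty_for` is a hypothetical helper, not part of the commit, and it assumes that a score exactly equal to a threshold belongs to the higher band, so that every score at or above the lowest threshold receives a label.

```python
from typing import Optional

# Threshold values copied from the detector above.
LOW_THRESHOLD = 60
MEDIUM_THRESHOLD = 75
HIGH_THRESHOLD = 90


def certainty_for(complexity: float) -> Optional[str]:
    """Map a complexity score to a certainty label (hypothetical helper).

    Returns None when the score is below the lowest threshold, meaning
    the query is not flagged at all.
    """
    if complexity < LOW_THRESHOLD:
        return None
    if complexity < MEDIUM_THRESHOLD:
        return "low"
    if complexity < HIGH_THRESHOLD:
        return "medium"
    return "high"


# Scores on either side of each boundary.
for score in (59.9, 60, 74.9, 75, 89.9, 90):
    print(score, certainty_for(score))
```

Returning early from each branch means every possible score ends in exactly one band, so no score can fall through without a label.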
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,37 @@ | ||
"""Utility functions w.r.t code complexity metrics""" | ||
import math | ||
|
||
|
||
def halstead_metrics(n1: int, n2: int, N1: int, N2: int): | ||
""" | ||
Compute Halstead metrics for a given query. | ||
Parameters: | ||
n1 (int): Number of unique operators in the query. | ||
n2 (int): Number of unique operands in the query. | ||
N1 (int): Total number of operators in the query. | ||
N2 (int): Total number of operands in the query. | ||
Returns: | ||
N (int): Program length. | ||
n (int): Program vocabulary. | ||
V (float): Program volume. | ||
D (float): Program difficulty. | ||
E (float): Program effort. | ||
""" | ||
# Program length | ||
N = N1 + N2 | ||
|
||
# Program vocabulary | ||
n = n1 + n2 | ||
|
||
# Volume | ||
V = N * math.log2(n) | ||
|
||
# Difficulty | ||
D = n1/2 * N2/n2 | ||
|
||
# Effort | ||
E = D * V | ||
|
||
return (N, n, V, D, E) |
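
To make the four inputs concrete, the sketch below derives them from two token lists and applies the same formulas as `halstead_metrics`. The helper `halstead_from_tokens`, the toy query, and the split of its tokens into operators and operands are all assumptions chosen for illustration; the commit's own tokenisation is not shown here. The guards for empty input are also an addition of this sketch, since `log2(0)` and division by zero would otherwise raise.

```python
import math
from typing import Sequence, Tuple


def halstead_from_tokens(
    operators: Sequence[str], operands: Sequence[str]
) -> Tuple[int, int, float, float, float]:
    """Compute (N, n, V, D, E) from operator and operand token lists."""
    n1, n2 = len(set(operators)), len(set(operands))  # unique counts
    N1, N2 = len(operators), len(operands)            # total counts

    N = N1 + N2                                # program length
    n = n1 + n2                                # program vocabulary
    V = N * math.log2(n) if n > 0 else 0.0     # volume
    D = (n1 / 2) * (N2 / n2) if n2 > 0 else 0.0  # difficulty
    E = D * V                                  # effort
    return N, n, V, D, E


# Toy query: SELECT a, b FROM t WHERE a = 1 AND b = 1
# Which tokens count as operators is a convention; this split is an assumption.
operators = ["SELECT", "FROM", "WHERE", "=", "AND", "="]
operands = ["a", "b", "t", "a", "1", "b", "1"]

N, n, V, D, E = halstead_from_tokens(operators, operands)
print(f"length={N} vocabulary={n} volume={V:.2f} difficulty={D} effort={E:.2f}")
```

Here there are 5 unique operators out of 6 and 4 unique operands out of 7, so the difficulty is (5 / 2) × (7 / 4) = 4.375.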