Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
* Restructuring of project
  - Move definitions.py to /definitions folder
  - Update path in all files using definitions.py
  - Add anti-pattern markdown files
* Add basic CLI output setup
* Add argument to render descriptions
* Add markdown descriptions of anti-patterns
* Linting
* Linting
* Add general title to definitions
* Add title from definitions to detector output
* Fix typo
* Add title to printer
* Update docstring
* Add description to detector output
* Linting
1 parent 44caa85, commit a7ea4a9
Showing 23 changed files with 505 additions and 95 deletions.
@@ -1 +1,2 @@
-sqlparse==0.4.2
+sqlparse==0.4.2
+rich==12.0.0
@@ -2,4 +2,5 @@ flake8==4.0.1
 tox==3.24.5
 pytest==7.0.1
 pytest-cov==3.0.0
-mypy===0.931
+types-setuptools==57.4.10
+mypy==0.931
Empty file.
@@ -0,0 +1,20 @@
### Ambiguous Groups

This anti-pattern occurs when developers misuse the `GROUP BY` clause.

Every column in a query's `SELECT` list must have a single value per row group; this is known as the **Single-Value Rule**. For columns named in the `GROUP BY` clause this is guaranteed, because each of them has exactly one value per group, regardless of how many rows the group matches. Aggregate functions such as `MAX()`, `MIN()`, and `AVG()` also produce a single value for each group, so they satisfy the rule as well.
The database server, on the other hand, cannot be so certain about any other column listed in the `SELECT` list. It cannot always ensure that the same value appears in every row of a group for those columns. Depending on the DBMS, such a query is either rejected with an error or returns an arbitrary value from the group, which may cause erroneous results.

#### Example code

```SQL
SELECT CID, PID, MIN(date)
FROM customers JOIN shoppinglists USING (CID)
GROUP BY CID;
```

The code above shows a basic example of this anti-pattern. Because the `shoppinglists` table associates multiple products with a single customer, there are several distinct product ID values for a given customer ID. A grouping query that reduces to a single row per customer cannot express all of those product ID values.

#### Fix

Always make sure that every column in the `SELECT` list has a single value per group. This can be achieved by grouping over multiple columns or by wrapping the remaining columns in an aggregate function.
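
As a rough illustration of grouping over multiple columns, the sketch below runs the example query with `PID` added to the `GROUP BY` clause against an in-memory SQLite database. The table layouts and the sample rows are assumptions made for this example; only the table and column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema and sample data, invented for this illustration.
conn.executescript("""
    CREATE TABLE customers (CID INTEGER PRIMARY KEY);
    CREATE TABLE shoppinglists (CID INTEGER, PID INTEGER, date TEXT);
    INSERT INTO customers VALUES (1), (2);
    INSERT INTO shoppinglists VALUES
        (1, 10, '2022-01-05'),
        (1, 11, '2022-01-03'),
        (1, 10, '2022-01-01'),
        (2, 10, '2022-02-01');
""")

# Every non-aggregated column in the SELECT list now appears in GROUP BY,
# so each output row has exactly one CID and one PID.
fixed_query = """
    SELECT CID, PID, MIN(date)
    FROM customers JOIN shoppinglists USING (CID)
    GROUP BY CID, PID
    ORDER BY CID, PID;
"""
rows = conn.execute(fixed_query).fetchall()
for row in rows:
    print(row)
```

The result has one row per customer and product pair, each with the earliest date for that pair.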
@@ -0,0 +1,17 @@
### Fear of the Unknown

In SQL, values in columns can be left empty, which gives that attribute of the row a `NULL` value. SQL considers `NULL` to be a special value, distinct from zero, false, true, or an empty string. Therefore, it is not possible to test for `NULL` values with standard comparison operators such as `=`, `>=`, or `<>`. Instead, use `IS NULL` and `IS NOT NULL`.

#### Example code

```SQL
SELECT pName, suffix
FROM products
WHERE suffix <> NULL;
```

The code shown above queries the product name and suffix columns from the products table where the suffix is not equal to `NULL`. One might expect this to return all rows that have a suffix, but this is not the case. Any comparison to `NULL` evaluates to _unknown_, not true or false. Therefore, this query does not return any rows.

#### Fix

Use `IS NULL` and `IS NOT NULL` when testing for `NULL` values.
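
The difference can be checked with a small sketch against an in-memory SQLite database. The `products` rows are invented for this example; the two queries differ only in the `NULL` test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed sample data: one product without a suffix.
conn.executescript("""
    CREATE TABLE products (pName TEXT, suffix TEXT);
    INSERT INTO products VALUES ('lamp', 'XL'), ('desk', NULL), ('chair', 'S');
""")

# Anti-pattern: the comparison with NULL is never true.
broken = conn.execute(
    "SELECT pName, suffix FROM products WHERE suffix <> NULL ORDER BY pName;"
).fetchall()

# Fix: IS NOT NULL selects the rows that actually have a suffix.
fixed = conn.execute(
    "SELECT pName, suffix FROM products WHERE suffix IS NOT NULL ORDER BY pName;"
).fetchall()

print(broken)  # []
print(fixed)   # [('chair', 'S'), ('lamp', 'XL')]
```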
@@ -0,0 +1,16 @@
### Implicit Columns

When writing a query that needs a lot of columns, developers often opt to use the SQL wildcard selector `*`. Every column from the table(s) specified in the `FROM` clause is then returned, so the list of columns is implicit instead of explicit. In many ways, this makes the query more concise. However, it comes at a cost: the result set can be quite big for large tables, which hurts the performance of the query.

#### Example code

```SQL
SELECT *
FROM purchases;
```

Suppose we have the task of finding all customer IDs, store IDs, and product IDs, as well as the quantity and price, from the purchases table. The only columns of the purchases table that the task does not ask for are the purchase ID and date columns. The query shown above contains the implicit columns anti-pattern because it uses the wildcard `*`. Instead of selecting only the columns requested by the task, it returns a result that also includes the purchase ID and date columns.

#### Fix

Always explicitly select the columns you need. Use the `*` wildcard with caution.
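
To make the difference concrete, the following sketch compares the columns returned by the wildcard query with those of an explicit column list, using an in-memory SQLite database. The column names (`purchaseID`, `CID`, `SID`, `PID`, `quantity`, `price`, `date`) are assumptions; the text above only describes them in words.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layout of the purchases table; only the table name is from the text.
conn.execute("""
    CREATE TABLE purchases (
        purchaseID INTEGER PRIMARY KEY,
        CID INTEGER, SID INTEGER, PID INTEGER,
        quantity INTEGER, price REAL, date TEXT
    )
""")


def column_names(query: str) -> list:
    """Return the names of the columns a query yields."""
    cursor = conn.execute(query)
    return [col[0] for col in cursor.description]


implicit = column_names("SELECT * FROM purchases;")
explicit = column_names("SELECT CID, SID, PID, quantity, price FROM purchases;")

# Columns that the wildcard drags along although the task never asked for them.
print(sorted(set(implicit) - set(explicit)))  # ['date', 'purchaseID']
```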
sqleyes/definitions/antipatterns/poor_mans_search_engine.md
21 changes: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
### Poor Man's Search Engine

Suppose we want to search for words or sentences in our database. The first thing that comes to mind is a SQL pattern-matching predicate, such as the `LIKE` keyword with a pattern, or `REGEXP`. Both seem like a very good option for text searches.

However, the main problem of pattern-matching predicates is their poor performance. Because patterns with a leading wildcard generally cannot use a traditional index, the database must scan every row of the specified tables. The overall cost of a table scan for this search is very high, since matching a pattern against a column of strings is a costly operation compared to other comparisons, such as integer equality.

Another problem with simple pattern matching using the keyword `LIKE` or regular expressions is that it can find unintended matches, making the search result inaccurate.

#### Example code

```SQL
SELECT *
FROM products
WHERE pName LIKE '%cat%';
```

The code above shows an example of how to search for products that have the word "cat" in their product name. Note that the pattern also matches names that merely contain these letters, such as "category". This should be avoided if the products table is large.

#### Fix

Use a specialized full-text search facility instead of pattern matching. Some DBMSs provide one as standard.
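
The unintended matches are easy to reproduce. The sketch below runs the `LIKE` query against an in-memory SQLite database with invented product names, and contrasts it with a toy word index built in Python. The word index is only a stand-in to show the idea of matching whole words; it is not a substitute for the full-text search facility of a real DBMS.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
# Assumed sample data for this illustration.
conn.executescript("""
    CREATE TABLE products (pName TEXT);
    INSERT INTO products VALUES
        ('Cat food'), ('Scatter cushion'), ('Dog leash'), ('Concatenation guide');
""")

# Anti-pattern: substring matching also hits "Scatter" and "Concatenation".
like_matches = [
    name for (name,) in conn.execute(
        "SELECT pName FROM products WHERE pName LIKE '%cat%' ORDER BY pName;"
    )
]

# Toy inverted index: maps each lower-cased word to the product names using it.
word_index = defaultdict(set)
for (name,) in conn.execute("SELECT pName FROM products;"):
    for word in name.lower().split():
        word_index[word].add(name)

word_matches = sorted(word_index["cat"])

print(like_matches)  # ['Cat food', 'Concatenation guide', 'Scatter cushion']
print(word_matches)  # ['Cat food']
```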
@@ -0,0 +1,17 @@
### Random Selection

When writing a query that needs to select a random row from a table, developers might use `ORDER BY RAND() LIMIT 1`, where the `RAND()` function is used to sort the data randomly. However, this is not the best solution. When `RAND()` is used inside an `ORDER BY` clause, the sort cannot use an index, since no index contains the values returned by the random function. This is a big concern for the query's performance, because using an index is one of the best ways to speed up sorting. Without an index, the database must scan the whole table and sort the entire result set, which performs poorly on large tables.

#### Example code

```SQL
SELECT CID
FROM customers
ORDER BY RAND() LIMIT 1;
```

A typical use case for this anti-pattern is the task of selecting a random customer ID from the customers table. The query above shows a typical (faulty) solution.

#### Fix

Choose a random value using other means. Common approaches are to generate a random value between 1 and the greatest primary key, or to count the total number of rows and generate a random number between 0 and the row count. The random number can then be used inside a `WHERE` clause.
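
A minimal sketch of the first approach, using Python's `sqlite3` module: the random key is generated in application code and looked up through the primary-key index. The helper name `random_customer_id` and the sample data are hypothetical. The `>=` comparison is an assumption added here so that gaps in the key range still yield a row; with gaps, the selection is not perfectly uniform.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (CID INTEGER PRIMARY KEY)")
# Assumed sample data with gaps in the key range (only even IDs exist).
conn.executemany(
    "INSERT INTO customers VALUES (?)", [(i,) for i in range(2, 101, 2)]
)


def random_customer_id(connection, rng=random):
    """Pick a customer ID via a random key instead of ORDER BY RAND()."""
    max_id = connection.execute("SELECT MAX(CID) FROM customers").fetchone()[0]
    if max_id is None:  # empty table
        return None
    target = rng.randint(1, max_id)
    # The primary-key index finds the first existing ID at or above the target.
    row = connection.execute(
        "SELECT CID FROM customers WHERE CID >= ? ORDER BY CID LIMIT 1",
        (target,),
    ).fetchone()
    return row[0]


picked = random_customer_id(conn)
print(picked)
```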
@@ -0,0 +1,29 @@
DEFINITIONS = {
    "anti_patterns": {
        "ambiguous_groups": {
            "filename": "ambiguous_groups.md",
            "title": "Incorrect GROUP BY usage",
            "type": "Ambiguous Groups"
        },
        "fear_of_the_unknown": {
            "filename": "fear_of_the_unknown.md",
            "title": "Incorrect NULL usage",
            "type": "Fear of the Unknown"
        },
        "implicit_columns": {
            "filename": "implicit_columns.md",
            "title": "Avoid usage of wildcard selector",
            "type": "Implicit Columns"
        },
        "poor_mans_search_engine": {
            "filename": "poor_mans_search_engine.md",
            "title": "Avoid pattern matching",
            "type": "Poor Man's Search Engine"
        },
        "random_selection": {
            "filename": "random_selection.md",
            "title": "Avoid ORDER BY RAND() usage",
            "type": "Random Selection"
        }
    }
}
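
As a sketch of how such a dictionary might be consumed, the snippet below looks up one entry and builds the path to its markdown description. The helper `describe` and the constant `DESCRIPTIONS_DIR` are hypothetical and not part of this commit; the directory is taken from the one file path visible in the diff, and the dictionary is an abbreviated copy.

```python
from pathlib import Path

# Abbreviated copy of the DEFINITIONS dictionary, one entry only.
DEFINITIONS = {
    "anti_patterns": {
        "fear_of_the_unknown": {
            "filename": "fear_of_the_unknown.md",
            "title": "Incorrect NULL usage",
            "type": "Fear of the Unknown",
        },
    }
}

# Assumed location of the markdown descriptions; adjust to the real layout.
DESCRIPTIONS_DIR = Path("sqleyes/definitions/antipatterns")


def describe(key: str) -> dict:
    """Hypothetical helper: collect title, type and description path."""
    entry = DEFINITIONS["anti_patterns"][key]
    return {
        "title": entry["title"],
        "type": entry["type"],
        "path": DESCRIPTIONS_DIR / entry["filename"],
    }


info = describe("fear_of_the_unknown")
print(f'{info["type"]}: {info["title"]} ({info["path"].as_posix()})')
```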
This file was deleted.