Commit 205ed18

author
Chris Massey
committed
Fixed TOC, added new section
Fixed the TOC & headers, added internal linking, and added the "Problems with Database Design" section
1 parent b76e6c6 commit 205ed18

1 file changed

+54
-27
lines changed

README.md

Lines changed: 54 additions & 27 deletions
@@ -5,21 +5,19 @@ SQL code smells
55
66
#Contents
77

8-
- Introduction
9-
- Problems with Database
10-
- Design
11-
- Problems with Table
12-
- Design
13-
- Problems with Data Types
14-
- Problems with Expressions
15-
- Difficulties with Query
16-
- Syntax
17-
- Problems with Naming
18-
- Problems with Routines
19-
- Security Loopholes
20-
- Acknowledgements
21-
22-
#Introduction
8+
- [Introduction](#Introduction)
9+
- [Problems with Database Design](#Problems_With_Database_Design)
10+
- [Problems with Table Design](#Problems_with_Table_Design)
11+
- [Problems with Data Types](#Problems_with_Data_Types)
12+
- [Problems with Expressions](#Problems_with_Expressions)
13+
- [Difficulties with Query Syntax](#Difficulties_with_Query_Syntax)
14+
- [Problems with Naming](#Problems_with_Naming)
15+
- [Problems with Routines](#Problems_with_Routines)
16+
- [Security Loopholes](#Security_Loopholes)
17+
- [Acknowledgements](#Acknowledgements)
18+
19+
<a name="Introduction"></a>
20+
#Introduction
2321
**Once you’ve done a number of SQL code-reviews, you’ll be able to identify signs in the code that indicate all might not be well. These ‘code smells’ are coding styles that, while not bugs, suggest design problems with the code.**
2422

2523
Kent Beck and Massimo Arnoldi seem to have coined the term ‘CodeSmell’ in the ‘[Once And Only Once](http://www.c2.com/cgi/wiki?OnceAndOnlyOnce)’ page of www.C2.com, where Kent also said that code ‘wants to be simple’. Kent Beck and Martin Fowler expand on the issue of code challenges in their essay ‘Bad Smells in Code’, published as Chapter 3 of the book ‘Refactoring: Improving the Design of Existing Code’ (ISBN 978-0201485677).
@@ -34,15 +32,44 @@ In describing all these 119 code-smells in a booklet, I’ve been very constrain
3432

3533
-*Phil Factor, _Contributing Editor_*
3634

37-
#Problems with Database
38-
#Design
39-
#Problems with Table
40-
#Design
41-
#Problems with Data Types
42-
#Problems with Expressions
43-
#Difficulties with Query
44-
#Syntax
45-
#Problems with Naming
46-
#Problems with Routines
47-
#Security Loopholes
48-
#Acknowledgements
35+
#Problems with Database Design <a name="Problems_With_Database_Design"></a>
36+
##1) Packing lists, complex data, or other multivariate attributes into a table column
37+
38+
It is permissible to put a list or data document in a column only if it is, from the database perspective, ‘atomic’, that is, never likely to be shredded into individual values; in other words, as long as the value remains in the format in which it started. We store strings, after all, and a string is hardly atomic since it consists of an ordinally significant collection of characters or words. A list or XML value stored in a column, whether by character map, bitmap or XML data type, can be a useful temporary expedient during development, but the column will likely need to be normalized if values will have to be shredded.
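
As a rough illustration of why a packed list causes trouble once its values have to be picked apart, the sketch below compares a delimited column with a normalized table. It uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and every table and column name (`OrderPacked`, `OrderProduct`, `ProductCodes`) is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Smell: a delimited list packed into one column (hypothetical schema).
    CREATE TABLE OrderPacked (OrderID INTEGER PRIMARY KEY, ProductCodes TEXT);
    INSERT INTO OrderPacked VALUES (1, 'A10,B20'), (2, 'B2,C30');

    -- Normalized alternative: one row per order/product pair.
    CREATE TABLE OrderProduct (
        OrderID     INTEGER NOT NULL,
        ProductCode TEXT    NOT NULL,
        PRIMARY KEY (OrderID, ProductCode)
    );
    INSERT INTO OrderProduct VALUES (1, 'A10'), (1, 'B20'), (2, 'B2'), (2, 'C30');
""")

# Searching the packed column needs pattern matching, and 'B2' also
# matches 'B20', so order 1 is returned by mistake.
packed_hits = [row[0] for row in conn.execute(
    "SELECT OrderID FROM OrderPacked "
    "WHERE ProductCodes LIKE '%B2%' ORDER BY OrderID")]

# The normalized table answers the same question with a plain equality test.
normalized_hits = [row[0] for row in conn.execute(
    "SELECT OrderID FROM OrderProduct "
    "WHERE ProductCode = 'B2' ORDER BY OrderID")]

print(packed_hits, normalized_hits)
```

The normalized form also lets the primary key reject duplicate codes within an order, which the packed string cannot do.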
39+
40+
A related code smell is:
41+
###Using inappropriate data types
42+
43+
Although a business may choose to represent a date as a single string of numbers or require codes that mix text with numbers, it is unsatisfactory to store such data in columns that don’t match the actual data type. This confuses the presentation of data with its storage. Dates, money, codes and other business data can be represented in three forms: a human-readable ‘presentation’ form, a storage form, and a data-interchange form.
44+
45+
Storing data in the wrong form as strings leads to major issues with coding, indexing, sorting, and other operations. Put the data into the appropriate ‘storage’ data type at all times.
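
A minimal sketch of the sorting problem, assuming a hypothetical payments table held once with a string amount and once with a numeric amount. SQLite (via Python's `sqlite3`) stands in for SQL Server here; its type system is looser, but the ordering behaviour of strings versus numbers is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: the same amounts stored as text and as numbers.
    CREATE TABLE PaymentAsText (PaymentID INTEGER PRIMARY KEY, Amount TEXT);
    CREATE TABLE PaymentTyped  (PaymentID INTEGER PRIMARY KEY, Amount NUMERIC);
    INSERT INTO PaymentAsText VALUES (1, '9.50'), (2, '10.00'), (3, '100.25');
    INSERT INTO PaymentTyped  VALUES (1, 9.50),   (2, 10.00),   (3, 100.25);
""")

# Strings sort character by character, so '9.50' lands after '100.25'.
text_order = [row[0] for row in conn.execute(
    "SELECT PaymentID FROM PaymentAsText ORDER BY Amount")]

# Numeric storage sorts by value.
typed_order = [row[0] for row in conn.execute(
    "SELECT PaymentID FROM PaymentTyped ORDER BY Amount")]

# Aggregates go wrong in the same way: the 'largest' string is '9.50'.
text_max = conn.execute("SELECT MAX(Amount) FROM PaymentAsText").fetchone()[0]

print(text_order, typed_order, text_max)
```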
46+
47+
##2) Storing the hierarchy structure in the same table as the entities that make up the hierarchy
48+
49+
Self-referencing tables seem like an elegant way to represent hierarchies. However, such an approach
50+
mixes relationships and values. Real-life hierarchies need more than a parent-child relationship. The ‘Closure Table’ pattern, where the relationships are held in a table separate from the data, is much more suitable for real-life hierarchies. Also, in real life, relationships tend to have a beginning and an end, and this often needs to be recorded. The HIERARCHYID data type and the common language runtime (CLR) SqlHierarchyId class are provided to make tree structures represented by self-referencing tables more efficient, but they are likely to be appropriate for only a minority of applications.
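
One possible shape for a closure table is sketched below, again using Python's `sqlite3` as a stand-in for SQL Server. The `Employee` and `EmployeeClosure` tables and their columns are hypothetical; the point is only that the relationships live in their own table, with one row per ancestor/descendant pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The entities themselves carry no hierarchy information.
    CREATE TABLE Employee (
        EmployeeID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );

    -- One row per ancestor/descendant pair, including each node
    -- paired with itself at depth 0.
    CREATE TABLE EmployeeClosure (
        AncestorID   INTEGER NOT NULL REFERENCES Employee(EmployeeID),
        DescendantID INTEGER NOT NULL REFERENCES Employee(EmployeeID),
        Depth        INTEGER NOT NULL,
        PRIMARY KEY (AncestorID, DescendantID)
    );

    -- Ann manages Bob, and Bob manages Cy.
    INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO EmployeeClosure VALUES
        (1, 1, 0), (2, 2, 0), (3, 3, 0),
        (1, 2, 1), (2, 3, 1),
        (1, 3, 2);
""")

# Everyone below Ann, at any depth, comes from a single join with no recursion.
descendants = [row[0] for row in conn.execute("""
    SELECT e.Name
    FROM EmployeeClosure AS c
    JOIN Employee AS e ON e.EmployeeID = c.DescendantID
    WHERE c.AncestorID = 1 AND c.Depth > 0
    ORDER BY c.Depth
""")]

print(descendants)
```

Because the relationship rows are separate from the entity rows, columns recording when a relationship starts and ends could be added to the closure table without touching `Employee`.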
51+
52+
##3) Using an Entity Attribute Value (EAV) model
53+
The use of an EAV model is almost never justified and leads to very tortuous SQL code that is extraordinarily difficult to apply any sort of constraint to. When faced with providing a ‘persistence layer’ for an application that doesn’t understand the nature of the data, use XML instead. That way, you can use XSD to enforce data constraints, create indexes on the data, and use XPath to query specific elements within the XML. It is then, at least, a reliable database, even though it isn’t relational!
54+
55+
##4) Using a polymorphic association
56+
Sometimes one sees table designs with ‘keys’ that can reference more than one table, with the referenced table usually identified by a separate column. In other words, an entity can relate to one of several different entities, depending on the value in a second column that names the target entity type. This sort of relationship cannot be enforced by foreign key constraints, the query optimizer struggles to produce good plans for the joins, and the join logic is likely to get complicated. Instead, use an intersection table or, if you are attempting an object-oriented mapping, look at the way SQL Server represents the database metadata: it creates an ‘object’ supertype class that all of the individual object types extend. Both devices give you the design flexibility that polymorphic associations attempt to provide.
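
The supertype approach might look something like the sketch below, where comments can attach to either articles or photos. All names (`Commentable`, `Article`, `Photo`, `Comment`) are hypothetical, and SQLite via Python's `sqlite3` stands in for SQL Server. Unlike a polymorphic ‘key’, every reference here is an ordinary foreign key that the database can enforce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    -- Supertype: one row for anything that can receive comments.
    CREATE TABLE Commentable (CommentableID INTEGER PRIMARY KEY);

    -- Subtypes share the supertype's key.
    CREATE TABLE Article (
        ArticleID INTEGER PRIMARY KEY REFERENCES Commentable(CommentableID),
        Title     TEXT NOT NULL
    );
    CREATE TABLE Photo (
        PhotoID INTEGER PRIMARY KEY REFERENCES Commentable(CommentableID),
        Caption TEXT NOT NULL
    );

    -- Comments reference the supertype, so a single foreign key suffices.
    CREATE TABLE Comment (
        CommentID     INTEGER PRIMARY KEY,
        CommentableID INTEGER NOT NULL REFERENCES Commentable(CommentableID),
        Body          TEXT NOT NULL
    );

    INSERT INTO Commentable VALUES (1), (2);
    INSERT INTO Article VALUES (1, 'Code smells');
    INSERT INTO Photo   VALUES (2, 'Whiteboard');
    INSERT INTO Comment VALUES (10, 1, 'Useful'), (11, 2, 'Blurry');
""")

# A comment pointing at a nonexistent parent is rejected by the constraint.
try:
    conn.execute("INSERT INTO Comment VALUES (12, 99, 'Orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```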
57+
58+
##5) Creating tables as ‘God Objects’
59+
‘God Tables’ are usually the result of an attempt to encapsulate a large part of the data for the business domain in a single wide table. This is usually a normalization error, or rather, a rash and over-ambitious attempt to ‘denormalize’ the database structure. If you have a table with many columns, it is likely that you have come to grief on the third normal form. It could also be the result of believing, wrongly, that all joins come at great and constant cost. Normally, such tables can be replaced by views or table-valued functions over properly normalized tables. Indexed views can have maintenance overhead but are greatly superior to denormalization.
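
As a small sketch of the alternative, the wide ‘everything about a customer’ shape can be offered as a view over normalized tables rather than stored as a table. The schema below (`Customer`, `CustomerOrder`, `CustomerOrderSummary`) is hypothetical, and SQLite via Python's `sqlite3` stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized base tables (hypothetical schema).
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL
    );
    CREATE TABLE CustomerOrder (
        OrderID    INTEGER PRIMARY KEY,
        CustomerID INTEGER NOT NULL REFERENCES Customer(CustomerID),
        Total      NUMERIC NOT NULL
    );
    INSERT INTO Customer VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO CustomerOrder VALUES (100, 1, 25), (101, 1, 75), (102, 2, 10);

    -- The wide, convenient shape is derived by a view, not stored.
    CREATE VIEW CustomerOrderSummary AS
        SELECT c.CustomerID,
               c.Name,
               COUNT(o.OrderID) AS OrderCount,
               SUM(o.Total)     AS TotalSpent
        FROM Customer AS c
        LEFT JOIN CustomerOrder AS o ON o.CustomerID = c.CustomerID
        GROUP BY c.CustomerID, c.Name;
""")

summary = conn.execute(
    "SELECT Name, OrderCount, TotalSpent "
    "FROM CustomerOrderSummary ORDER BY CustomerID").fetchall()

print(summary)
```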
60+
61+
##6) Contrived interfaces
62+
Quite often, the database designer will need to create an interface to provide an abstraction layer between schemas within a database, between a database and ETL processes, or between a database and an application. You face a choice between uniformity and simplicity. Overly complicated interfaces, for whatever reason, should never be used where a simpler design would suffice. It is always best to choose simplicity over uniformity. Interfaces have to be understood, as well as clearly documented and maintained.
63+
64+
##7) Using command-line and OLE automation to access server-based resources
65+
In designing a database application, there is sometimes functionality that cannot be implemented purely in SQL, usually when other server-based or network-based resources must be accessed. Now that SQL Server’s integration with PowerShell is so much more mature, it is better to use that, rather than xp_cmdshell or sp_OACreate (or similar), to access the file system or other server-based resources. This needs some thought and planning. You should also use SQL Agent jobs when possible to schedule your server-related tasks. These require up-front design to prevent them from becoming unmanageable monsters, prey to ad-hoc growth.
66+
67+
68+
#Problems with Table Design <a name="Problems_with_Table_Design"></a>
69+
#Problems with Data Types <a name="Problems_with_Data_Types"></a>
70+
#Problems with Expressions <a name="Problems_with_Expressions"></a>
71+
#Difficulties with Query Syntax <a name="Difficulties_with_Query_Syntax"></a>
72+
#Problems with Naming <a name="Problems_with_Naming"></a>
73+
#Problems with Routines <a name="Problems_with_Routines"></a>
74+
#Security Loopholes <a name="Security_Loopholes"></a>
75+
#Acknowledgements <a name="Acknowledgements"></a>
