Note: This standard is still in the process of being written: <Under Construction>. Click here, for the older version.
To find a comprehensive guide on SQL Server Database Coding Standards, is not easy. Many contributions are scattered across the internet, and most companies collate them into a proprietary held manifest, to which we have no access.
The purpose of a database coding standard would be to help database, and application developers, to firstly conform their code-style, and code-practices, and secondly create a maintainable solution. It would even be a good thing to see people adopt a reasonably uniform style, on an international basis. When browsing the usual sites, won’t it be grand to see database professionals coding SQL universally conformal.
When reading the book, Professional SQL Server 2008 Internals and Troubleshooting (Christian Bolton, Justin Langford, Brent Ozar, James Rowland-Jones, Steven Wort, and Jonathan Kehayias), I was pleasantly surprised at the SQL code-style. Why, because it looks like a style I prefer. The book is not consistent throughout, probably because of the different authors, so let’s get into the detail of SQL formatting.
In the SQL code-example image (SQL Code-style Sample), the first code block, is from p47, fig. 2-16 of the book. This is a beautiful layout. Take note of the capitalised key-words, the indent between them, and the actual T-SQL code, as well as the closing of the statement with a semi-colon. The only things I would like to change are:
SQL Code-style Sample
- A standard intent, between key-word (SELECT) and the T-SQL code, of three tabs. This is to anticipate JOINs, which could be comprised of three words: LEFT OUTER JOIN. The three tabs, allow for a seamless fit.
- Moving the commas to the front, i.e. those separating columns in the SELECT statement. This is a personal preference, because each comma can be aligned that way; whereas, if it is placed after the line they align according to the line’s length, and is more often than not, ragged.
The last example (in the image insert), often found as you browse the internet, is the most loathsome form of SQL, in my opinion. It makes me cringe. However, despite my personal quirks, it does not look too abominable. As seen, it has structure, and it is very readable. But, as we progress with this coding standard, it will become very evident that the longer, and more complex the code becomes, the more crucial a good choice of layout is.
Of course, style is an aspect of a database coding standard, there are a few other things that should also be considered.
This article is based on Microsoft SQL Server, and Transact-SQL, but that is of little importance, as most of the concepts can be equally applied to other database technologies. Where proprietary language constructs differ, the principals governing style, layout and capitalisation, are encompassing.
A standard should address at least, the following things:
- Naming Standards
- Database Objects
- Databases, Files, and File Groups
- Project specific naming variations
- Formatting, and capitalisation
- Variable declarations, and initialisations
- Object comment decorations
- Error handling
- Project, or Enterprise specific
- C# CLR
- Enterprise best-practices, for database developers
- Enterprise database development templates (boiler-plate code)
- Files, and File Groups
- Storage architectures, and implementation guide-lines
- Data types
- Dynamic SQL
- Transaction Isolation
- Execution Plans, Statistics IO, and Time
- Query Hints
- Referential Integrity
- Temporary Tables
- Visual Studio Database Projects
- Configuring, and Managing a Database Project
- Developing with Visual Studio, and Database Tools
- Microsoft SQL Management Studio
- Physical Data Modelling
This standard, being a coding standard, won’t elaborate on hardware. Despite, dealing with files, and file groups, strictly speaking also falls onto the configuration domain, do however exercise a direct influence on the code. When creating tables, and indices, their locations have to be specified, and once specified remains under source control. Therefore, if a file group is changed, code has to be checked out, and changed.
||Where any hints and tips are provides, this image is displayed. A hint, or tip is not regarded as a standard, but as a consideration, augmenting, or empowering the standard.
||Mandatory elements, are denoted with this sign. These are the standard, and must be honoured in the enterprise.
||Take note of a certain principal, or idea.
||Do not do this!
||Use it, don’t use it, make up your mind about it, whatever.
||That’s pure horse!
| Must, or Shall
|| This is a mandatory indication that the specified action must be performed. There is no optionality, or variableness intended with this proclamation.
| May, or Can
|| An optional element is expressed, to be considered in light of its context, and decided upon by its estimation of quality.
|| Indicative of strong indication of use for purpose supposed, but not obliged. For all purpose, and consideration, this should be used, and it is by qualification the right thing, however, it is not be enforced through the commands of mus,t or shall, and is therefore recommend, but not mandatory.
Through Standardisation a few qualities are achievable, e.g. legibility, maintainability, transportability, and longevity . There are few things so frustrating than inconsistency. Inconsistent naming conventions create heaps of problems, not only to the human eye, but to applications, systems, and reporting tools. If you ever worked on a large project, made up of several parallel streams of implementation, with common databases, you’ll find development often meanders their own path. This leas to general confusion, and the road of pursuit ends in a waste of time, and money, producing a solution that is very cumbersome to maintain.
Naming conventions may overlap with other IT functions. It could put control in various hands, which may be physically different departments. Different naming conventions can cause overall inconsistencies. This standard must be in conformance to those respective entities, from the point of view that general enterprise consensus must have been sought. Creating any silo-based standard, process flow, or policy, is counter productive.
- Any naming standard, is better than no standard
- There isn’t a one-and-only globally-correct standard
- A standard is only as good, as its objectives, and whether or not it is being used
- Use the standard, if it has been put in place, and do update it where out of practice
||Favour readability over brevity.Choose natural, and intuitive names.Create names that catch the eye, e.g. use camel-case (pascal-case) to create visual hooks. This notation may negate the use of the under-score, because it provides a significant visual differentiation in text.Consistency is more important than rule, but consider detrimental conventions, discussed in the “do not” indications.Using prefixes in object names, can help greatly with the organisation of objects, as they are displayed, not only in a database management user-interface, but also in source control, i.e. in a database project.When using a prefix, it must be used in the context of describing the object, in relation to its business taxonomy, e.g. CBS_ (Cbs) for Client Billing System, or CRM_ (Crm) for Customer Relationship Manager.Consider the use of Schemas, when considering a functional prefix. The two concepts might clash, for the name, e.g.
- SELECT CustomerID FROM [Cbs].[Customers]
- SELECT CustomerID FROM [Cbs].[CbsCustomers], OR [Cbs].[CBS_Customers]
- SELECT CustomerID FROM [dbo].[CbsCustomers], OR [dbo].[CBS_Customers]
- Avoid abbreviations, as the main name of an object, e.g. use Invoice instead of Inv.
- Don’t write in UPPER CASE, its archaic, and very anoying.Never use sp_ (for a stored procedure’s prefix), and never tbl_ (for a table’s prefix), and never col_, Pk_, Fk_, for a column’s name. See nota bene for the two variations, when using prefixes in object names, otherwise never prefix an object name, because it is either stating the obvious, or may impact performance, such as the sp_ prefix resulting lookup, for strored procedures.
- Avoid spaces, and special characters (with the provision on the under-score) in the names, even if the database allows it.
- Don’t include table names in the names of table attributes, where the relationhip is obvious.
- Don’t use any identifiers/ words that conflict with other programming languages.
- Don’t use language-specific keywords, for names, e.g. GetDecimal, rather GetLength [6.].
|| There are variations, on the usage of a prefix, in the name of a database object. A couple of examples are:
- a user-defined function, and
- a view.
It is beneficial to prefix these with udf_, and vw_ (view), respectively. It makes it clear what is being joined to, in a SELECT statement, which is to differentiate it from a table. For all other object, don’t.
||Standardised naming for database objects, must be applied universally throughout a project, and the enterprise. Consistency is more important than the choice.
|| Name prefixes, may either be a three-digit capitalised acronym, followed by an under-score, or a three-digit camel-case acronym with no under-score. The format depends on the organisation’s subjective preference. Whatever is decided, consistency is more important than the choice. Apply the same convention throughout the project, or better, the organisation.
In the table below, is a list of must use, naming standards, for database implementations for the organisation.
Naming Standards by Type
This is the name given to a database, i.e. the one displayed by Microsoft SQL Server Management Studio (SSMS), under [Databases].Use camel-case to format the name. It is important, to be as descriptive about the name, describing the functionality of the system, or business function, for which the database is being implemented.
- Do not use an acronym, as database name, even if the system is known by its acronym.
- Do not use numbers, e.g. ClientBilling1 & ClientBilling2.
- Do not make the name too long, restrict to about thirty characters.Do not start names with under-scores.
- Never use a space in the name, even if the system allows it.
- Do not use abbreviations in the name, e.g. CustRelManagement, for CustomerRelashionsipManagement.
Write out the functional description of the system, or company (depending on the size of the implementation) as succinctly as possible.
The company name is Acme Corporation, and it only uses one database:
AdventureWorks (Microsoft’s sample database)
The company has a huge database implementation, comprising several transactional, data warehouse, virtual databases, and some data-marts:
- When naming a SQL Server instance, take note that the instance has the server’s name, as the first part of the full name, e.g. <server name><instance name>.
- The server name would be a name defined by the Infrastructure department, in accordance with their server naming conventions. As database administrator, or architect, you may not have any influence over that. It may also be the case for the instance name, and it could be something as non-descriptive (to you) as: SQL02.The fully qualified name for a database is:
- A Database [Instance] is optional, so when reading the name, it should make sense about the database.
- View the database name, as a function of the fully qualified name, e.g. <server>.[instance].<databasename>. Name the instance, as a function of:
- The application being serviced
- The domain of the instance, viz. Production, User Acceptance Testing, Testing, or Development
- For example: D4DBSL0121SupplyChainProd. Purchasing, where SupplyChainProd denotes the application, and the domain or environment.
- When specifying a linked-server name, use the instance name, or if the server has no instance, create the linked server’s name, in accordance with the instance’s criteria.
Database File Group Names, and Database File Names
A database File Group, is a logical name, as container, for database files.The default File Group name is: PRIMARYDatabase files are comprised of [3.]:
- Secondary and,
- Log Files
Database Files And File Groups
A database File Group, is a development entity, i.e. is is put under source control, as part of the database object’s schema definition. For example, when a table is defined, it’s FILE GROUP is specified as the logical target with which the table’s creation is associated.Provide a meaningful name, denoting the kind of object, and/ or the business designation of the object.For example, grouping tables by function, or Input/ Output characteristics, e.g. Trades, and Orders, split between them, into [TRADES] and [ORDERS] file groups. Data, inserted into the respective tables, are stored in data files, according to the FILE GROUPs.
File Group Name Examples:
Data File Name Examples:
||There is no truth in creating multiple log files, to increase performance, the file is sequentially written to, and provides no advantage in having multiples. Create only one. [4.]
||Use a numbering system, to separate multiple user-defined data files, for the same DATA GROUP. By virtue of being a data-file, the database designer might not have control over the naming convention, but only over the FILE GROUP.
A table’s name should be viewed as a construct between the owner/ role, and the actual name: [<role>].[<table name>], e.g.
- [Cbs].[CustomerInvoices], or
- [dbo].[CbsCustomerInvoices], or
Where tables are referenced by systems, but do not belong to a particular system, use n system-agnostic prefix, e.g:
- [RDA].[Customer], or
- [dbo].[RDA_Customer], or
- [dbo].[RdaCustomer], or
- [dbo].[ADM_Customer], or
Note: RDA – Reference Data, or ADM – Administrative, i.e. it’s not system specific, but common to a couple, or few systems.
Kinds of database tables are:
- Normal tables – containing data for a specific entity
- Business Data – Business objects, e.g. Orders, Customers, Invoices.
- Application Data – User interface object data, e.g. drop-down list specific
- Join/ configuration tables – contain configuration data, e.g. Customers joined toOrders
- Temporary tables – temporary calculation staging results
- Session scope – visible to the connected SPID (server process identity) that created it
- Global scope – visible all SPIDs, using it, inclusive of the creator, until all SPIDs disconnect
The naming convention should be similar, for each of the table types.
- Create table-names in the plural, e.g. Customers, and not Customer, or Personnel, or Employees.Configuration (junction) tables must be named:
- <Primary table name> And <Secondary table name><Configuration>
- Example: CustomersAndOrdersConfiguration
- Refer to the ISO/IEC 11179 standard [7.].
- Use a prefix, for grouping tables according to the business system in which they are used.
- Example: Reference Data can be prefixed with RDA_ (Rda), or ADM_ (Adm), the latter meaning Administration. Reference data may not belong to a particular business system. For example, Customer Billing might use Customers, while Business Surveillance might use Billing, as well as Customer tables.
- For tables that belong to a user-interface (application screen) only, a prefix may be used, to group them separately from business specific tables, where these are defined in the same database, as the business objects.
- Do not prefix a table name with a superfluous acronym like “tbl_”, it just serves to make you look stupid.
- Don’t use any classification acronym, or word, to differentiate variable temporary, and global temporary tables. They have their language-specific classification already, e.g. @ (variable), # (temporary), and (##) global temporary tables.
A view is noting other than a virtual table, or a definition (query) for data to be materialised, at run-time, from defined tables, according to the view’s defined schema. The naming convention for a view is the same, as for a table, with added specialisations:
- A view is a specialisation of a table, but different to a table. Because of the similarity in which they are defined in a SQL query, it might be required that the database developer is aware of this. Therefore, the naming of the view, must be different, so that it is apparent that it is a view:
- Add a prefix (Vw, or View) to the view’s name.
- Views can denote a more complex result than its base (source) tables. The view might compound the query logic, by combining base results, even aggregates, in a materialised view, at run-time. Therefore, the naming convention must accommodate the view’s function, as much as possible.
- The naming of views, may contribute to ease of development, and trouble resolution, because the object (from where data is requested) is readily identifiable as a view, and would not be mistaken for a table.
- Views can be used as security principals, to provide users access to the materialised results, even where they might not have access to tables queried by the view.
||Use the prefix Vw, or View, to denote the object as a view, in order to differentiate it from normal tables, and user-defined functions, all of which are selected from, or joined on, in a standard SQL query.
|| Views can be utilised as reporting entities, or as specific data queries. You could consider customising the naming, to reflect this.
- Reporting: [dbo].[VwSalesAnnualCustomerSalesSummary]
- Specific Data: [dbo].[VwInvoicesCustomersAndOrders]
Table attributes are part of a table’s schema definition. Because the table is the super structure, it is obvious that the attributes belong, or are in association with the table. Therefore, the table’s name does not have to be repeated in the naming convention of the attribute.
Columns describe the properties of the table. The column’s name must describe the purpose, and value of the column, as succinctly and naturally as possible:
- Columns with identity fields, must be so indicated, e.g. EmployeeId, or InvoiceNumber. These columns must be of Integer or BigInteger data-types
- Columns with code identities, must be so indicated, e.g. ProductCode, or InvoiceNumber. These columns must be of Varchar, or NVarchar data types
- Columns with descriptions, must be so indicated, e.g. ProductDescription. These columns must be of Varchar, or NVarchar data types
- Columns with boolean types, must be so indicates, e.g. IsActive, HasAccess, IsValid.
Note: The suffix of Number (as an identity) could either denote an integer or string value, but where a code is used, it is an alpha-numeric (string) type value.
A table’s primary key, must be prefixed with “PK_”, and indicate if it is clustered or non-clustered. In Visual Studio 2012, the primary key should not be created separately, i.e. in a Keys folder. If this is done, the functionality of Visual Studio is short-changed, where the primary key can be defined, using the graphic table designer screen, by simply clicking the field as a primary key. Note: Do inspect the name given, as this standard prescribes a different convention.
A table’s foreign key, must have the same name as the column in the primary-key table, preceded by the name of the primary table.
Keys must be defined, respectively, as follows:
- “PK_CIX” Primary Key, clustered index (uniqueness is inferred): PK_CIX_CustomerID
- “PK_NCIX” Primary Key, non-clustered index (uniqueness is inferred): PK_NCIX_CustomerCode
- “FK_” Foreign Key:
- FK_<primary table name>.<primary key>_<foreign key column name>: FK_Customer_PK_CustomerID_InvoiceID
Table constraints are part of the table.
Default and Check Constraints
- “CK_” Check constraint, e.g. on date range: CK_DateRange
- “DF_” Default constraint, e.g. on ProcessingDate: DF_ProcessingDate
Triggers are like stored procedures, but attached to a table, and acts in response to a table action, viz. Insert, Delete, or Update, in an event-driven fashion, invoked by SQL Server.
The trigger name should be natural, and descriptive. It belongs to a table, so it is not necessary to mention the table, again, in the name.
- <prefix (t|Trg|trg|Trigger|trigger)>[_]<Trigger Type (After|InsteadOf)><Action (Insert|Update|Delete)>
Indices are created as the table, for clustered indices, and off-tables, as non-clustered indices.
Clustered, Clustered Unique, Non-clustered, and Non-clustered non-unique.
A table can only have one clustered index, because it is the table, and multiple non-clustered indices, because they are created off-table. A clustered index can be the primary Key, but it does not have to be. A primary key should only exist, when the table has referential integrity. A primary key ensures uniqueness, and non-null-ability. Both clustered and non-clustered indices can be non-unique. However, a clustered index, as primary key, cannot be.
A primary key that is a clustered index, takes the naming convention of the primary key, described next, under keys.
- “NCIX_U_” Unique, non-clustered index: NCIX_U_MemberIndustryCode
- “NCIX_NU_” Non-Unique, non-clustered index: NCIX_NU_MemberIndustryCode
- “CIX_U” Clustered index, unique: CIX_U_CustomerId
SQL Server, when auto generate statistics, is turned on, automatically creates table statistics, by providing a system name, e.g. _WA_Sys_00003_3CF40B7E [8.].
When creating a user-defined table column statistic, any (free-text) name can be given. The statistic is bound to a table’s column, and listed under [Statistics]. It is beneficial to know on which column it is created, the table is immaterial, because it is obvious.
Generic name: <prefix (ust|ustat)>[_]<Column Name>
||Do prefix table constraint-names, with the appropriate prefixDo prefix keys (primary & foreign), with the appropriate prefixDo prefix indices, with the appropriate prefix, and specify the uniwue/ non-unique indicator.
|| A column may have the table’s name, in its name, e.g. CustomerId, as opposed to Id. When using a table, and its columns in a query, the table name must be aliased, and also, consequently the column. This should provide a unique reference, and a designation of which table the column belongs, negating the necessity of having the table’s name in the column name. This, however, depends on how well the SQL query is written, i.e. how attentively this standard is being followed.
- Do not prefix a table’s column name with: COL_, or col_, or cl, or anything to denote that it is a column.
- Any table attribute that does not reference another table, must not have the table’s name, as part of its name.
- Do not name a column, by specifying “Id” in the name, where it is not a numeric data type.
- Similarly, do not specify “code” in the column’s name, if it is not of a string type.Do not name a boolean column without a boolean descriptor, e.g. Is, Has.
- A clustered index, does not have to be the Primary Key. A non-clustered index, can be the primary key, given the index is unique, and not-null.
- A primary key does not have to be the auto-incremented integer column of a table. However, a primary key must be unique [5.].
User-defined functions are used in similar fashion to tables, and views when joined to, or selected from, in SQL queries. For this reason, their names should be considered akin to these other objects.A user-defined function also shares characteristics with stored procedures, especially because they are able to return a scalar-value. It also has a parameter list, or input value-list, such as stored procedures, and unlike tables, and views.Another goal to strive towards, is finding a user-defined function, if a database has many of them. To achieve this, one should add some attribute.Furthermore, a function has many functions. It can be used to selectBecause of these similarities, and especially on the bearing towards views, and tables, the naming convention of a user-defined function should reflect uniqueness.
To avoid going on about: “we are all related”, let’s set a standard.
Prefix:Do prefix the name of a user-defined function with:
- ft_ Tablular function
- fs_ Scalar function
- fa_ Aggregate function
Application (Business Specialisation) and Sub-application:
The function’s name, must be able to indicate meaning, within the context of the query, so that developers are able to easily identify them accordingly. The function’s name must also be locatible where referenced in the database management, or the development application. For this purpose, an Application, and Sub-application identifier are used.
This is the functionality or processing, performed by the user-defined function. It denotes the name of the function.
- <prefix>[<Business Specialisation>][_][<Business Sub-specialisation>]<Process>
- ftCustomerAndCustomerRoles(), or
- ft_CBL_CustomerAndCustomerRoles() [where CBL is the business specialisation], or
- ft_CBL_Invoicing_CustomerStatements() [where Invoicing is the sib-business specialisation]
NOTE: The usage of the under-score [_] should be gauged, to see that it makes legible sense.
Disambiguation of parameters, alleviates many a development woe. Distinguish parameters from locally defined variables.
Name: The name of a parameter is the same type of name for any variable. See later section on variable names.
Prefix: A prefix of “p”, e.g. pBillingRunDate, is sufficient to describe a user-defind function’s parameter.
Generic name: <prefix (p) [p-input parameter])<Parameter Name>
||Do prefix the user-defind function, so as to differentiate its imlementation with that of views and tables.User-defined function parameters must be prefixed, simply for disambiguation.
||Do not add a verb, as action descriptor, to the function’s name, e.g. dbo.ftGetCustomerAndCustomerRoles, or ftSelectCustomerBillingInformation. The reason being, that any development environment, or editor, sorts alphabetically, and will consequently group all the Get’s, and all the Select’s together. This is counter-intuitive to any human being, looking for objects by business specialisation. A function is a selection mechanism, so one is always to expect something to be returned, and not that the database is maintained.
Stored Procedure, and Stored Procedure Parameters
A stored procedure is a container for storing executible code in a database. The code can execute, very much like that of a view, function, or trigger. However, there is a distinct difference, in that a stored procedures can not be used in a SQL statement for INSERT, UPDATE, DELETE, or SELECT. The keyword EXECUTE is alway associated with the execution of a stored procedure. This modus of execution, makes the stored procedure unique. Therefore, it does not have to be identified in a specific manner (such as adding a prefix), because it can’t be confused.
A stored procedure is able to return results in four ways, viz. by result-set (table), scalar-value (numeric), return value (simplistic boolean or integer) value, or output parameter(s).
Application (Business Specialisation) and Sub-application:
The stored procedure’s name, must be locatible, where referenced in a database management, or development application. For this purpose, an Application, and Sub-application identifier are used.
Because a stored procedure, is a multi-purpose functional database code container, it not only returns data, but has the ability to insert, update and delete data from a store. For that reason, the name of the stored procedure has to be disambiguated. To this end, an Action descriptor, is supplied.
[<Business Specialisation>][_][<Business Sub-specialisation>][_][<Action>]<Process>
Stored Procedure Parameters:
Disambiquation of parameters, alleviates many a development woe. For example, within a large stored procedure, it is expedient to know which values are local variables, input, or output parameters. It makes it much easier for a developer to identify values, and aids to faster development time.
Name: The name of a parameter is the same type of name for any variable. See later section on variable names.
Prefix: A prefix of “p”, e.g. pBillingRunDate, is sufficient to describe a stored procedure’s input parameter.
Consequently, an output parameter prefix, should be: oNumberOfInvoicesInStatement.
Generic name: <prefix (p|o) [p-input parameter|o-output parameter])<Parameter Name>
A return value, is any functional local variable that is wrapped in the RETURN(<local variable:integer>) function. The function denotes its type, i.e. return value, and the function parameter, the data type, i.e. integer.
- Use a natural name that describes the function of the stored procedure as succinctly as possible.
- Be consistent in the choosing of the stored procedure’s name, from the Generic name specification.
- Disambiguate stored procedure parameters, by prefixing them.
- Stored procedure parameters, must be prefixed, according to the type, and their names, specified by the rules of variables.
- Do not prefix the stored procedure’s name with sp_, because it is reserved for system stored procedures, and may cause a future name clash.
- Secondly, where a name clash is the case, SQL Server first executes the system stored procedure.
- Thirdly, the server first looks in master, to see if it can find a system stored procedure (and if not), executes the user-defined one, given that a fully qualified referenced name is not used
- Do not use another type of prefix, other than sp_, e.g. usp_ (user stored procedure)Do not use verbs in the name, aimed at describing the action, e.g. Select, or Get. It causes object sorting in development and administration environments to sort counter-intuitively.
||When considering using a Business Specialisation, and Business Sub-specialisation, especially in connection with Action, consider the sorting order of the names of stored procedures. Omitting any of these, or using a combination of some, might cause bizare sorting arrangements.
Variables, and Parameters
Any database code object, e.g. user-defined function, or stored procedure, can take parameters, and can have local variables. For ease of identification, in vain with the entire purpose of naming conventions, these also fall part of the same regimen.Local Variables:
Objects, viz. stored procedures, and user-defined functions, have parameters. They are specialised in that a stored procedure has both input, and output parameters, while a function only takes input parameters. The naming convention follows prescriptions for each object, and generally according to local variables.
But, as with many things, there are variations, and it depends on the variable’s scope in application. On large projects, the coordination between database objects, and applications, can be performed directly, via a Data Virtualisation stratagem, e.g. Object Relational Mapping (ORM) solutions. Keeping track of which variable is what, and what data type conversion to do, can become tedious.
To resolve such predicament, the variable can be decorated, so that it might be identified for more than one scenario.
<prefix (p|o) [p-input parameter|o-output parameter])<data-type acronym><Parameter Name>
User Defined Function:
<prefix (p) [(p-input parameter])<data-type acronym><Parameter Name>
These are named as closely to what they are, e.g. EmployeeIdentificationNumber, as logic allows.
Generic name: <@v><Variable Name>
NOTE: It is possible to have variables that are used like constants. For these, replace the <@v>, with <@c>, to denote the difference.
||Prefix variables and parameters.
||For large projects, include a datatype descriptor in the parameter’s name, to make it easier for developers to correlate database and application parameters.
Suggested SQL-Data-Type-Prefix Acronyms, for Parameters
Using acronyms, described below, as prefixes for SQL object parameters, they can be readily identified on the application side.
SQL Data Type
SQL Server development comprises all T-SQL (transact SQL), and C# Common Language Runtime (CLR) development.
In code, formatting is paramount to a legible, and maintainable code base. Some formatting aspects are subjective, and organisations, even persons, may differ diametrically. This standard is one way of standardising database development code formatting.
A development template (boiler-plate) is a very good mechanism through which to provide developer guidance. The following description of standardisation, is combined, and delivered in a template.
Formatting a Basic SQL Statement
The basic SQL statement serves to illustrate keyword capitalisation, and statement formatting.
Basic SQL Statement Formatting
The basics is to format the statement, with future additions in mind.The placement of the comma-separators, for table columns, is subjective. Many put the comma after the column name, but that causes a ragged placement, which might not produce the most pro-CDO outcome.
Getting the basic statement correctly formatted, helps with more complicated SQL.
- Capitalise keywords.
- Use a 3-tab indent, between SELECT and the first column name, which is also applicable for the spacing between FROM and the table name. This accommodates later usage of LEFT OUTER JOIN keywords.
- Use a fully qualified name (three part name), as reference to the table name, from which to select.Use block-brackets, for words, same as keywords, or where they contain spaces.
||Use block-brackets around column names, and table references. This is a subjective coding style, unless the word being .
Formatting a more Complex SQL Statement
A complex SQL statement includes JOINs, nested statements, aliasing, or is even a Common Table Expression.
The SQL statement depicts a more complex query, one with a Common Table Expression (CTE), and a JOIN. When looking at SQL code, it must be pleasant. Uniformity, without obsessive compulsiveness.
Aligning the code, by keyword, and statement nesting-scope, causes a pleasant uniformity. The alignment of the aliases, can be according the nesting, but it looks nicer, just like the preceding comma placement that avoids a ragged appearance.
Note: The code is a database performance query, written by Glenn Berry (sorry–formatted by me).
The example does not use a LEFT OUTER JOIN, which would have indicated the requirement for a three-tab indentation, between SELECT and the column name, or FROM and the table reference.
- Use tabs, or spaces, to offset and format code lines, by keeping lines in the same columnar alignment, according to the scope of the nesting.Place all alias definitions inside square (block) brackets and specify the AS keyword, to denote them.
- Where referencing a column, by alias, ensure that it uses a bracketed instance of the reference. This significantly increases readability, and disambiguates the column-to-table relationships.
- Be as pedantic as feasible, by ending statements with semi-colons.
- Take note, that a statement preceding a CTE, must be terminated by a semi-colon, otherwise the CTE throws the error: “Incorrect syntax near the keyword ‘with’“.
- Make aliases descriptive, yet short. There’s nothing as silly as having an alias [X], or [Y]. When the SQL statement is long, one could easily lose the context of the referenced column, because x, and y don’t mean much.
|| Do not use a column number, when specifying the ORDER BY predicate.
Stored Procedure Formatting
A stored procedure is a typical code container, able to serve as demonstration of this standards, for all other code containers, e.g. user-defined functions, triggers, or SQL scripts.
The format of a stored procedure’s create script, is comprised of:
- Create statement
- Procedure Name
- Input, and Output Parameters
- Execution (security) Context
- Flower Box
- Abstract, describing the function of the procedure
- Author details
- Version details
- Declarations, Initialisations & Validations
- Setting of Directives and Transaction Isolation Levels
- Local Variables, and their initialisations
- Any validations on parameters
- Main Code Execution
- SQL Query
- Return Value
- Error Handling
- Custom Logging
- Return value
- Clean-up, or Housekeeping
- Transaction Isolation Levels
- Additional Information Box
- Edit history
- Any other pertinent information
- Extended Properties
- Functional Description
- Current Editor
- Edit Date
- Template Version
The Stored Procedure CREATE Statement
The CREATE statement comprises the procedure’s name, parameters, and an EXECUTE AS specification.The WITH EXECUTE AS statement provides the stored procedure with a user context to use, for validating permissions on objects that are referenced by the module.
The parameters, differentiated between input and output, are prefixed accordingly.
The Stored Procedure Flower Box
A flower box is probably a spill-over from main-frame programmers. SQL coding, with the exception of SQL CLR, is very main-frame like. It has the same top-down layout as Adabas, and Cobol programs.However that sides, I like it.
One could argue that developer interfaces, such as Visual Studio, with Team Foundation Server (TFS), keeps more than adequate track of versions and authors, and that by specifying attributes, duplicate work is being undertaken. That may be so, but more often than not, branch information in TFS gets turfed, objects dropped from the database, and there goes the meta data history. It is, unfortunately, dependent on being maintained by the developer, but with a firm development life-cycle practice, good process and management, and above all, the adherence to a standard such as this one, it shouldn’t be a problem.
An aspect that seems helpful, is to maintain an example of the object’s execution. More often than not, the database on which it is deployed, still has valid data, for such a test-type example to execute and return meaningful data. It helps, with later maintenance. Also, it might be a while since a developer has used output values, or a returned value, and such an example lends itself to a speedy latch.
Mention of the Coding Standard, used when last dealing with this object, might also be a much use, especially to the current editor.
People come and go, so the name of a developer, especially where contractors were used, may not be important. However, if that person is still available, given memory server him well, some guidance may still be sought from such.
The Stored Procedure Declaration, Initialisation, and Validation
The Declaration, Initialisation, and Validation section, allows for the setting of directives, transactional isolation, local variables, and (if required) validation on input parameters.
The SET NOCOUNT ON, if this is not familiar yet, suppresses row count information from being returned to a calling application, as well as, depending on whether it’s specified before @@ROWCOUNT, resets its value, as any statement would.
Transaction isolation is set in accordance with the specific requirement of this stored procedure.
The XACT_ABORT, is a SQL 2005 optimisation that forces a roll-back and termination of the entire transaction (error severity level 11-16), on error. If it is left off, only the transaction, causing the error, is rolled back, i.e. SQL chooses how to resolve error. It’s used in combination with TRY. . . CATCH, the latter to handle the error and the XACT_ABORT to determine what to do if an error happens. It would appear that using XACT_ABORT, by itself, with no TRY. . . CATCH, is faster [9.].
Standard error variables, and specific query variables are declared separately. Their initialisation, where applicable, can be done in the same statement, as the declaration. Note the variable prefixes.Alignment is important.
See the variable declarations, aligning not only the names, but also the data types.Take note of the headings. See the later section on textual (comment) decorations.
- Do set the directive NOCOUNT to ON.
- Do the basic alignment, using either tabs, or spaces.
- Prefix local variables. Use @v – as local variable indicator.
- Do not forget to SET NOCOUNT ON;
- Do not cause any ragged-edge alignments.
- Consider the transaction isolation level. By not specifying it, the default of READ COMMITTED, is assumed.
- Initialise variables, as part of the declaration, where it makes sense.
Main Code Execution
For a stored procedure, the main code execution section, is the guts of the object’s functionality, and the purpose for which it had been written. As mentioned above, transactional handling can be wrapped in a TRY. . . CATCH with XACT_ABORT ON, or just run with XACT_ABORT ON. However, the stored procedure is the object of choice, for error handling, and custom logging, where implemented.
The example, indicates that the executive SQL code, is wrapped in a TRY. . . CATCH. Notice also, the RETURN, in this example, sending back a specific integer values, one that has meaning to the context for which it is returned. If the procedure does not throw an error, the return value is something meaningful, in context of the execution [10.], but not a business relevant figure, and not NULL.
Wrapping the SQL statement in TRANSACTION depends on the nested level of the stored procedure, or how the transactional management is being implemented. But, if it’s handled in this stored procedure, it must be wrapped. With SNAPSHOT isolation, even a pure SELECT statement must be wrapped [11.].
- Do demarcate this section with an appropriate banner, i.e. flower-box textual comment.
- Do the basic alignment, using either tabs, or spaces, see the formatting of SQL statements earlier in this standard.
- Use TRY. . . CATCH, for error handling, especially for custom logging.
- Utilise return values, where applications can benefit from it.
Error Handling (Catch)
The CATCH, or error handler, is an important section of a stored procedure, if you would like to know why it went boom!
It can get very complicated, depending on the level, and finesse of the handling. The XACT_STATE (ms-help link provided) can be inspected to determine the state of the transaction. Depending on the nesting-level, i.e. the sequence level of stored procedure call, the error might be handled differently. SQL Server returns the error tree, one for each stored procedure, when it fails, to the first stored procedure in the call-stack.
The RAISERROR, where the procedure failed to commit the transaction, bubbles-up to the calling procedure. This is also the place, in the code, where a custom error handler can be called. It might be expedient to log stored procedure errors, by sending the relevant details to a handler (other stored procedure) that writes the information to a database table, or central error repository. If such events are logged to the system’s event log, enterprise exception applications, such as System Centre Operations Manager, can retrieve the log and report it to operational staff, in the company, for resolution.
This example provides a guide-line, but the specific implementation will depend on the available error logging abilities of a company. In this case, a descriptive error message is generated, indicating the error line, procedure, error number, severity and state, and sent to the calling procedure.
Clean-up, or Housekeeping
Crossing the t’s, and dotting the i’s.
SQL Server will, when the query exits, resume its defaults, so strictly speaking, these actions are not necessary.
Additional Information Box
This “flower-box” decoration, is quite the same thing as described above, as the flower-box. In this instance, it keeps information about the edit history of the object, and other pertinent information about the object.
Again, one could argue that source code team systems keep an edit history, but still, these also lose the history.
Extended properties, are user-defined values that become attributes of a stored procedure, and are referenced via Stored Procedure Properties, after it has been deployed to the database.
Why use them? The argument is similar to using the flower-box for adding Additional Information to the code of the stored procedure. It’s also a step towards meta-data, and a data dictionary, plus it’s also a documentation aid. It can be very powerful where applications read these properties, in order to evaluate the object consistently, and treat the data in the same way [12.]. With Visual Studio 2012, or adding SQL Server Data Tools to an existing Visual Studio 2010 environment, extended properties are properly supported. The extended property statement is appended to the end of the stored procedure create script, or as a file of the type “extendedproperty.sql” in the folder <objects> to which it is attributable.
Functions have three types, viz. tabular, scalar, and aggregate. The scalar and tabular functions can be applied in a variety of ways, e.g. in-line, or multi-statement functions, as part of a SELECT statement. The aggregate functions are created through CLR integration. With SQL Data Tools, or Visual Studio/ SQL Server 2012 development, CLR objects, and the assembly, has been made part of the build/ deploy. It’s drop-dead-easy to work with database projects since this addition.
The standard is similar for SQL user-defined functions than for stored procedures.
SQL Code Commenting
Code comments are required for formatting, as well as functional, and code implementation clarifications. So far, flower-box inclusions, as part of the code format, has been specified. Other commenting styles are:
- Single line code clarifications
- Paragraph style introductions
- Use banners, to separate distinct functional areas, including a succinct description of the section’s function.
- Use a single line delineator where a section has to be headed, but there is no need of a verbose description
- Ensure that spacing is provided, as indicated, to align comments, and beautify their visual appeal
- Variations in commenting, such as using a banner with description, but the first prefixed by the double-dash, and the second with a slash-star, is not acceptable. It must be used consistently, the one, or the other.
||It does not matter what style of commenting is used, as long as it is used consistently.
SQL Common Language Runtime (CLR) Integration Standards
When to use the CLR
The CLR, for SQL Server, has been around since SQL Server 2005. To develop SQL Server Database projects, Visual Studio could be used, but unfortunately, it was not that easy. Only after the release of SQL Data Tools [13.], and Visual Studio 2012, did it (in my opinion) become a feasible thing to make easy use of the CLR capability. This, especially because it can now be deployed in one-click-style. Before it was a very cumbersome process, outside of the database project, for most of the deployment, requiring quite a bit of manual labour.
The rule-of-thumb, for using CLR objects in a database project, is to perform operations that are too clumsy for SQL to do. What are they [14.][17.]?
- String manipulation – dynamic SQL constructions
- Regular Expressions [15.]
- Computation orientated queries, requiring the .net framework, e.g. the Black-Scholes option pricing model [16.]
- SQL Process Controllers, with thread management, or iterations
- Do use the CLR where the code to be executed is:
- has to utilise regular expressions, or
- is a decision-rich construction of dynamic SQL
- Do run batch-controllers, and data-intensive process controllers on the database, by coding them in the CLR. Note: define what is meant by “batch controller”, and don’t implement an application that is better suited in a .net container. See “whatever” below, for further clarity on this.
- Do use CLR objects, e.g. stored procedure, for creating dynamic SQL that is dependent on elaborate business rules, comprising many input parameters. The query-plan, resulting from the submission of a dynamic SQL call through sp_executesql, with parametrisation, might be a much more optimal plan, than including a myriad of input parameters, through standard TSQL.
- Use SQL Data Tools (Visual Studio 2010/ 2012) projects to incorporate CLR database development. The datadude version is way too cumbersome. Note: In Visual Studio 2010 with SQL Data Tools, the project properties must be set to .net 3.5, it is not capable of .net 4.
- Enable the clr function on the database server.
|Do not use CLR objects for manipulating set-based data sets, i.e. a function that should ideally be coded in traditional TSQL.Do not set a name space in the CLR class.
- A business process controller, some code that schedules or brokers decisions about the execution of database objects, based on some business rules, can be coded in .net, to run out of a managed code container, such as a C# application of Windows service. This is probably the best space to do it in, if it entails threading.
- However, when that controller is a data processing function, it might just be that the closer to the data it executes, the better. It also depends on exactly what is done by the controller. One wouldn’t implement elaborate controller that could function in systems such as Control-M, or any other batch controllers. It would be a menial object with a more than average number of IF. . . ELSE decisional statements, compounded by string manipulations and the likes.
CLR Database Interaction
Using CLR objects facilitates the CLR .net language run-time, in this case C#, to be available for writing the objects, and using the available framework functionality to implement them. This is a jump up from the database, where SQL lives. However, interaction between the CLR and database strata is not only possible, but would most probably occur. This standard details the SqlFunction, and SlqProcedure variances.
CLR SqlFunctions and SqlStoredProcedures are defined, using attributes . The Microsoft.SqlServer.Server namespace contains every possible attribute. The first method in an object, is the object type, e.g. SqlFunction, but it can also be decorated by attributes, as directives or properties for the object built in the datbabase.
A CLR user-defined function can be decorated with a few attributes .
- IsDeterministic – Idempotent if true, i.e. returns same results when called with same parameters
- DataAccess – DataAccessKind.None/ Read, if reading data from context SQL connection
- SystemDataAccess – Reading system tables, SystemDataAccessKind.None/ Read
- IsPrecise – Determines if to do floating point operations, true does not use FLOAT
- FillRowMethodName – Method in CLR code, used to perform table-valued filling of data
- TableDefinition – The table structure to be returned by the TVF
- Name – Name of the SQL function, the one Published to the database
||Use regular connection when connecting to a remote server  for data.Configure the CLR assembly for external data access [EXTERNAL_ACCESS] on the project properties page, if data-access, outside of the local server is required. Note: The security context is SQL Server service account.
||Don’t pass table-valued-parameters to CLR functions, it can’t be done.Don’t use returned integers other than SqlInt32, or System.Int32.
CLR Code Formatting
CLR code, is essentially .net code. In this standard’s case C#. The same formatting rules, governing C# code, should be applicable in the database CLR. There’s no need to step on the toes of an existing application coding standards. It should be honoured in this standard.
Glossary of Terms
Word, or Acronym
Meaning, and Description
|| A repository, in which data is stored, usually a relational database. It is a logical grouping of data files, each stored on a disk location.
| Relational Database
|| This is a term, used to describe the popular notion of a database. This is a structure in which data is stored, made up of objects, such as tables, and other programmable entities, such as stored procedures and functions. The relational model was first defined by Edgar Codd of IBM in the 1970’s.
|| A table is an object of a database, in which data is stored in rows and columns.
|| A view is an object of a database, and is a level of data virtualisation, known by and referred to as a derived table. It achieves data encapsulation, and abstraction for the source table(s), and their structures.
| User Defined Function
|| A user-defined function is a programmable object of a database, and is an implementation of a virtual table. When used, in a T-SQL statement, it can return a tabular result-set, or scalar value to a calling object, or SELECT statement. The T-SQL code of a user-defined function, cannot be dynamic SQL.
|| A stored procedure is a programmable object of a database, which comprises T-SQL statement(s) that can be dynamic SQL, and returns a tabular result-set, a scalar value, or an output parameter scalar value, to the calling object. A stored procedure is executed, and cannot be part of a SELECT statement.
Last Updated: 31 October 2012, 16h25