Coding and database standards like a religion or politics. Some people are fanatics and others really don’t care whatsoever. Getting stuck between any combination of the right and left or those that don’t care can be entertaining and sometimes outright dangerous.
The HBO show Silicon Valley has a whole episode based on formatting (tabs vs spaces). But that’s an entirely different topic. For now I’d like to quickly discuss naming of objects.
CamelCase vs Using_Underscores
CamelCase naming is achieved by concatenating word together with each word starting with a capitol letter. Some examples of this are: CustomerOrders, UserPermissions, VenderAddresses.
I know some of you are already shaking your head at the fact that I used plurals for those objects. I’m not here to talk about that topic today so let’s put that on the side table and look at underscores.
Using the underscore naming convention each word would be separated by an underscore like the following: Customer_Orders, User_Permissions, Vendor_Addresses.
These conventions should be used not only for tables and views but for everything, including columns.
Let’s look at the following table:
CREATE TABLE HumanResources.EmployeePayHistory
( BusinessEntityID int NOT NULL,
RateChangeDate datetime NOT NULL,
Rate money NOT NULL,
PayFrequency tinyint NOT NULL,
ModifiedDate datetime NOT NULL
)
We need an index on RateChangeDate. Let’s see what that should look like:
CREATE NONCLUSTERED INDEX ix_ncl_EmployeePayHistory_RateChangeDate
ON HumanResources.EmployeePayHistory
( RateChangeDate ASC )
I like to separate the columns in an index name with underscores. I also like to start each index with ix_cl or ix_ncl depending on it being clustered or non-clustered.
Now what if this were created using underscores?
CREATE TABLE Human_Resources.Employee_Pay_History
( Business_Entity_ID int NOT NULL,
Rate_Change_Date datetime NOT NULL,
Rate money NOT NULL,
Pay_Frequency tinyint NOT NULL,
Modified_Date datetime NOT NULL
)
The same index would be named as follows:
CREATE NONCLUSTERED INDEX ix_ncl_Employee_Pay_History_Rate_Change_Date
ON Human_Resources.Employee_Pay_History
( Rate_Change_Date ASC )
This also applies to constraints and keys (PK, FK, UK). Assigning a default constraint to a table from the UI would result in something like the following: DF_Employee_Pay_History_Modified_Date. This just isn’t clean.
In my opinion, CamelCase looks cleaner and as you can see by the index naming convention I use it’s also much easier to parse.
I’m sure that there are a lot of strong opinions on this topic and I’d like to hear them all. Be sure to leave a comment.
Follow me on Twitter and LinkedIn and if you liked this post endorse me.
Technically I think what you are advocating for is PascalCase, not camelCase.
LikeLiked by 2 people
Good call!
LikeLiked by 1 person
Totally with you on CamelCase but why is the D in Id capitalised, it doesn’t start a new word? I see this a lot and it bugs me 🙂
LikeLiked by 1 person
I usually capitalize the D in ID. For some reason it looks wrong to me as Id. This is typically my only exception. I can see the argument for the lower case d though.
LikeLiked by 1 person
I agree with this, except for your use of plurals. Wasting characters.
LikeLiked by 1 person
Agreed.
LikeLiked by 1 person
The conventions were set for the databases I work on before I got to them. I like the convenience of underscores when copying data to Excel to present because I change underscores to blanks in the column headings and then wrap. Excel does nice line breaks in the heading at the blanks. It seems some of our shortest width data have the longest column names which wastes space in most presentations of if such as SSMS and Access. I have written regular expressions to put blanks in for when things become upper case followed by a lower case but this is only mostly effective and Excel doesn’t support regular expressions anyway.
LikeLike
You could always alias your column names for reports or extracts. Use brackets [ ] and put whatever you want.
LikeLike
I was about to comment about it being PascalCase too, but got ninja’ed.
However, the OP is a brave man for diving into the Format Wars. 🙂 Personally, for table and column names I like spaces as opposed to either concatenated words with caps or underscores, with table/column names in square brackets. I know that’s a very heretical position and comes with its own issues but I like the visual representation that results.
For *variable names* in stored procedures Pascal case makes a lot of sense since spaces simply aren’t allowed at all and underscores (IMO) should be reserved for constants (or pseudo-constants in T/SQL’s case) which should be entirely upper case.
True camel case (where the first letter of a name is lower case) is just dumb, again IMO. 🙂
Interesting article!
LikeLike
Many Business Intelligence tools such as Tableau automatically convert underscores to spaces and capitalize each word. Their conversion on PascalCase is Notpretty. Also, any attention to capitalization in SQL has to realize that it is treated differently in different DBMS. Many will automatically convert them to lower case in the data dictionary. Also, because capitalization is ignored by the parser many programmers, being lazy, will write their SQL in all upper case or all lower case, which is hard to read if the names were initially in PascalCase. It is better to use underscores so you can have a common naming standard across all DBMS.
LikeLike
I have always found PascalCase easier to read. I also use Underscores to separate organizing prefixes and suffixes to aid in parsing. Yes kudos to the OP for being brave enough to bring this up and kudos to the posters so far for being civilized. I have seen these discussions melt into gang turf wars that made the crusades look like scout campout weekends. I think it is important to look at readability. Pascal case allows to eyes to flow, even without the spaces present. Underscores are punctuation characters and we are trained in elementary school to mentally break at those characters. Again the mixing the way I explained earlier aids that. The name flows easily in the reading and if an important prefix or suffix has been added, the brain stops to recognize it. Just whatever you do, be consistent.
LikeLike
The trillions of times that I have spent pressing the Shift Key and the _ underscore were wasted. Life is too short. Long live CamelCase!
LikeLiked by 1 person
Camel/Pascal case is a real pain, particularly since I work on case-sensitive servers. It’s often very difficult to figure out, in advance, how each individual developer decides to capitalize or not. Besides, I honestly don’t see how anyone can say that this:
LastStockModifiedDate
is easier to read than:
last_stock_modified_date
And, in fact, studies have proven that camel case does slow comprehension of text: camel-cased identifiers take 13.5% to 20% longer to read.
Quick: Is it ZIPCode or ZipCode? (In case you’re not aware, ZIP *is* an abbreviation.)
Is it SSNAssignedDate or SsnAssignedDate?
Is it DOB or Dob?
LikeLiked by 1 person
@Scott, do you have a reference handy to support the stats you mentioned about the relative speed for comprehension of text?
LikeLike
Click to access ICPC2010-CamelCaseUnderScoreClouds.pdf
”
In response to the research questions posed in Section II,
we find that identifier style significantly affects time and
visual effort needed to correctly detect identifiers constructed
from a phrase. The underscore style is significantly faster
and positively influences the dependent variables. In the
Binkley et al. study [4], Phrase Length did not interact with
Style, however we find such an interaction in our analysis.
The common theme in both experiments is that camel-cased
identifiers take longer than underscored ones (13.5% in the
previous study and 20% in this study) overall. In the Binkley
et al. study, a higher accuracy was found for camel-cased
identifiers, however, in our study, all (except one) subjects
answered correctly on all questions making accuracy
comparisons irrelevant.
…
In the Binkley et al. study, only one-third of the subjects
were trained in camel-cased identifiers. In our case, we have
an equal proportion of experts (8) and novices (7). During
the demographic briefing, six subjects (40%) stated that they
preferred camel-case, seven (47%) stated that they preferred
underscore, and two (13%) had no preference. Also, in the
Binkley et al. study, non-programmers stated that camel
casing would be harder to visually process and thus lead to
more errors. Our results prove this claim to be true with the
data generated from the eye tracker (AFD(Q), AFD(correct),
AFD(distracters)).
“
LikeLiked by 1 person
As to “Id” vs, “ID”:
In general, I’m for “Id”.
But, really, shouldn’t it be “Id” if it means “identifier” but “ID” if it means, say, “Identity document”? Yikes, what a pain!
LikeLike
Just on a common sense level, how can anyone think that:
SalesTaxRateForNonGroceries
is as easy to comprehend as
sales_tax_rate_for_non_groceries
??
Again, ask 10 developers about Id/ID ZIP/Zip SSN/Sssn DOB/Dob ANSI/Ansi and I’ll bet you get some type of split; no way all of them will agree on all of those.
If you never work on case sensitive servers, that won’t matter as much to you. But to those who do, it can make a big different in how long it takes to find the correct column name.
LikeLike