Making the case for CamelCase naming

Coding and database standards like a religion or politics. Some people are fanatics and others really don’t care whatsoever. Getting stuck between any combination of the right and left or those that don’t care can be entertaining and sometimes outright dangerous.

The HBO show Silicon Valley has a whole episode based on formatting (tabs vs spaces). But that’s an entirely different topic. For now I’d like to quickly discuss naming of objects.

CamelCase vs Using_Underscores

CamelCase naming is achieved by concatenating word together with each word starting with a capitol letter. Some examples of this are: CustomerOrders, UserPermissions, VenderAddresses.

I know some of you are already shaking your head at the fact that I used plurals for those objects. I’m not here to talk about that topic today so let’s put that on the side table and look at underscores.

Using the underscore naming convention each word would be separated by an underscore like the following:  Customer_Orders, User_Permissions, Vendor_Addresses.

These conventions should be used not only for tables and views but for everything, including columns.

Let’s look at the following table:

CREATE TABLE HumanResources.EmployeePayHistory
(    BusinessEntityID int NOT NULL,
RateChangeDate datetime NOT NULL,
Rate money NOT NULL,
PayFrequency tinyint NOT NULL,
ModifiedDate datetime NOT NULL
)

We need an index on RateChangeDate. Let’s see what that should look like:

CREATE NONCLUSTERED INDEX ix_ncl_EmployeePayHistory_RateChangeDate
ON HumanResources.EmployeePayHistory
(    RateChangeDate ASC    )

I like to separate the columns in an index name with underscores. I also like to start each index with ix_cl or ix_ncl depending on it being clustered or non-clustered.

Now what if this were created using underscores?

CREATE TABLE Human_Resources.Employee_Pay_History
(    Business_Entity_ID int NOT NULL,
Rate_Change_Date datetime NOT NULL,
Rate money NOT NULL,
Pay_Frequency tinyint NOT NULL,
Modified_Date datetime NOT NULL
)

The same index would be named as follows:

CREATE NONCLUSTERED INDEX ix_ncl_Employee_Pay_History_Rate_Change_Date
ON Human_Resources.Employee_Pay_History
(    Rate_Change_Date ASC    )

This also applies to constraints and keys (PK, FK, UK).  Assigning a default constraint to a table from the UI would result in something like the following:  DF_Employee_Pay_History_Modified_Date. This just isn’t clean.

In my opinion, CamelCase looks cleaner and as you can see by the index naming convention I use it’s also much easier to parse.

I’m sure that there are a lot of strong opinions on this topic and I’d like to hear them all. Be sure to leave a comment.

Follow me on Twitter and LinkedIn and if you liked this post endorse me.

17 thoughts on “Making the case for CamelCase naming

  1. The conventions were set for the databases I work on before I got to them. I like the convenience of underscores when copying data to Excel to present because I change underscores to blanks in the column headings and then wrap. Excel does nice line breaks in the heading at the blanks. It seems some of our shortest width data have the longest column names which wastes space in most presentations of if such as SSMS and Access. I have written regular expressions to put blanks in for when things become upper case followed by a lower case but this is only mostly effective and Excel doesn’t support regular expressions anyway.

    Like

  2. I was about to comment about it being PascalCase too, but got ninja’ed.

    However, the OP is a brave man for diving into the Format Wars. 🙂 Personally, for table and column names I like spaces as opposed to either concatenated words with caps or underscores, with table/column names in square brackets. I know that’s a very heretical position and comes with its own issues but I like the visual representation that results.

    For *variable names* in stored procedures Pascal case makes a lot of sense since spaces simply aren’t allowed at all and underscores (IMO) should be reserved for constants (or pseudo-constants in T/SQL’s case) which should be entirely upper case.

    True camel case (where the first letter of a name is lower case) is just dumb, again IMO. 🙂

    Interesting article!

    Like

  3. Many Business Intelligence tools such as Tableau automatically convert underscores to spaces and capitalize each word. Their conversion on PascalCase is Notpretty. Also, any attention to capitalization in SQL has to realize that it is treated differently in different DBMS. Many will automatically convert them to lower case in the data dictionary. Also, because capitalization is ignored by the parser many programmers, being lazy, will write their SQL in all upper case or all lower case, which is hard to read if the names were initially in PascalCase. It is better to use underscores so you can have a common naming standard across all DBMS.

    Like

  4. I have always found PascalCase easier to read. I also use Underscores to separate organizing prefixes and suffixes to aid in parsing. Yes kudos to the OP for being brave enough to bring this up and kudos to the posters so far for being civilized. I have seen these discussions melt into gang turf wars that made the crusades look like scout campout weekends. I think it is important to look at readability. Pascal case allows to eyes to flow, even without the spaces present. Underscores are punctuation characters and we are trained in elementary school to mentally break at those characters. Again the mixing the way I explained earlier aids that. The name flows easily in the reading and if an important prefix or suffix has been added, the brain stops to recognize it. Just whatever you do, be consistent.

    Like

  5. Camel/Pascal case is a real pain, particularly since I work on case-sensitive servers. It’s often very difficult to figure out, in advance, how each individual developer decides to capitalize or not. Besides, I honestly don’t see how anyone can say that this:
    LastStockModifiedDate
    is easier to read than:
    last_stock_modified_date
    And, in fact, studies have proven that camel case does slow comprehension of text: camel-cased identifiers take 13.5% to 20% longer to read.

    Quick: Is it ZIPCode or ZipCode? (In case you’re not aware, ZIP *is* an abbreviation.)
    Is it SSNAssignedDate or SsnAssignedDate?
    Is it DOB or Dob?

    Liked by 1 person

    • @Scott, do you have a reference handy to support the stats you mentioned about the relative speed for comprehension of text?

      Like

      • Click to access ICPC2010-CamelCaseUnderScoreClouds.pdf


        In response to the research questions posed in Section II,
        we find that identifier style significantly affects time and
        visual effort needed to correctly detect identifiers constructed
        from a phrase. The underscore style is significantly faster
        and positively influences the dependent variables. In the
        Binkley et al. study [4], Phrase Length did not interact with
        Style, however we find such an interaction in our analysis.
        The common theme in both experiments is that camel-cased
        identifiers take longer than underscored ones (13.5% in the
        previous study and 20% in this study) overall. In the Binkley
        et al. study, a higher accuracy was found for camel-cased
        identifiers, however, in our study, all (except one) subjects
        answered correctly on all questions making accuracy
        comparisons irrelevant.

        In the Binkley et al. study, only one-third of the subjects
        were trained in camel-cased identifiers. In our case, we have
        an equal proportion of experts (8) and novices (7). During
        the demographic briefing, six subjects (40%) stated that they
        preferred camel-case, seven (47%) stated that they preferred
        underscore, and two (13%) had no preference. Also, in the
        Binkley et al. study, non-programmers stated that camel
        casing would be harder to visually process and thus lead to
        more errors. Our results prove this claim to be true with the
        data generated from the eye tracker (AFD(Q), AFD(correct),
        AFD(distracters)).

        Liked by 1 person

  6. As to “Id” vs, “ID”:

    In general, I’m for “Id”.

    But, really, shouldn’t it be “Id” if it means “identifier” but “ID” if it means, say, “Identity document”? Yikes, what a pain!

    Like

  7. Just on a common sense level, how can anyone think that:
    SalesTaxRateForNonGroceries
    is as easy to comprehend as
    sales_tax_rate_for_non_groceries
    ??

    Again, ask 10 developers about Id/ID ZIP/Zip SSN/Sssn DOB/Dob ANSI/Ansi and I’ll bet you get some type of split; no way all of them will agree on all of those.

    If you never work on case sensitive servers, that won’t matter as much to you. But to those who do, it can make a big different in how long it takes to find the correct column name.

    Like

Leave a comment