SQLFluff
Background
SQLFluff is a dialect-flexible and configurable SQL linter. Designed with ELT applications in mind, SQLFluff also works with Jinja templating and dbt. SQLFluff will auto-fix most linting errors, allowing you to focus your time on what matters.
Roles of a linting tool are to enforce cosmetic rules and prevent code smells.
"Code bases without a consistent style are hard to read and work with because of their unpredictable structure.... Beyond “ugly” code, there is another category of “problematic” code. Code that isn’t invalid per say, but might indicate an issue in the logic." - Daniel Mateus Pires, Towards Data Science
Use SQLFluff to enforce these rules.
SQLFLuff rules used in the .sqlfluff config in UCSC dbt project
SQL Style Guide Rule | Rule Code | SQLFluff Fix compatible | SQLFluff Default Configuration |
---|---|---|---|
Use trailing commas | LT04 | Yes | Yes |
Indents should be four spaces (except for predicates, which should line up with the where keyword) | LT02 | Yes | Yes |
Lines of SQL should be no longer than 120 characters | max_line_length | Yes | No |
Use all lowercase unless a specific scenario needs you to do otherwise. This means that keywords, field names, function names, and file names should all be lowercased. | CP01, CP02, CP03 | Yes | No |
Always use the as keyword when aliasing columns, expressions, and tables. | AL01, AL02 | Yes | Yes |
Grouping by column numbers is preferred over grouping by column names/aliases (eg. group by 1, 2) Note that if you are grouping by more than a few columns, it may be worth revisiting your model design. | AM06 | No | No |
Prefer union all to union | AM02 | N/A, (not supported in snowflake) | No |
Avoid using unnecessary table aliases, especially initialisms. It's harder to understand what the table called "c" is compared to "customers". | AL07 | Yes | No |
Be explicit with all join types for example use inner join instead of just join . | AM05 | Yes | No |
When joining multiple tables, always prefix the column names with the table name | RF02 | No | Yes |
Use single quotes for strings. | CV10 | Yes | No |
Don't use using in joins. | ST07 | Yes | No |
Start each CTE on its own line, Use a single blank line around CTEs to add visual separation | LT07, LT08 | Yes | No |
Don't end a line with an operator like and , or , + , || etc | LT03 | Yes | Yes |
If there is only one column expression, put it on the same line as select. If there are multiple column expressions, put each one on its own line (including the first one), indented one level more than select. If there is a distinct qualifier, put it on the same line as select . | LT09, LT10 | Yes | Yes |
Don't put extra spaces inside of parentheses | LT01 | Yes | Yes |
How to use sqlfluff config in your project
please use the versions that are in the requirements.txt e.g:
Install SQLFluff
pip install -r requirements.txt
there is a configured .sqlfluff in the root of the project, which follows the dbt sql style guide as closely as possible (there are some rules that are not able to be fixed)
.sqlfluff, .sqlfluffignore
cli examples:
lint a specific file
sqlfluff lint models/staging/setl/stg_dwh__setl_step_log.sql
lint all files in a directory
sqlfluff lint models/staging/setl
sqlfluff fix a specific file (caution: this will make changes to your file)
sqlfluff fix models/staging/setl/stg_dwh__setl_step_log.sql
sqlfluff fix files in specific directory (caution: this will make changes to all files)
sqlfluff fix models/staging/setl