sqlforge: March 2016

Wednesday, March 30, 2016

Database Health Assessments

Database configuration and maintenance challenges can, if not addressed, cause serious disruptions to database environments and an organization’s ongoing business. A small problem can go unnoticed today but become a larger problem tomorrow that impacts database and company performance, and data security and protection.

KEY AREAS
Backups
Security
Configuration
Maintenance
Disaster recovery

BACKUPS
- Database backups are a standard task and fundamental core responsibility
- Two most common issues that require a database restoration are user errors and hardware failure
- The database is a constantly changing part of a company’s technology landscape. In fact, studies show that a typical database doubles in size every three years. As a database grows, the backup procedures in place do not keep up with the many additions and changes that are made to its structure.
- Point-in-time recovery provides the capability so that in the event of failure, the database can be recovered to the time right before the database failure
- Backups need to be planned in such a way as to make point-in-time recovery possible
- Since most full backups are scheduled at most daily, without point-in-time recovery the organization will lose up to a full day's worth of transactions
- A large number of broken transaction log chains due to missing backup files
- Only one path to restore a database to the most recent PIT (versus multiple full backups with appropriate differential and transaction log backups)
- A lack of multiple backup chains available (even though disk space is inexpensive)
- Corrupt backup files resulting from a lack of continuous testing. Backups are useless until verified.
- Time and experience demonstrate that a database will eventually fail due to a disk subsystem failure, corrupted metadata, upgrade problems, etc. To meet established RTO SLAs, you must have a complete set of backups. In the event of a disaster, your backup strategy and its execution must meet or exceed the expected levels of data retention.
- For example, a mission-critical database will require near zero data loss in a crash scenario in many cases. Under these business parameters, the database must be in full recovery model and have transaction log backups every 5 to 15 minutes (based on SLAs).
- Many databases in FULL recovery model had no transaction log backups occurring. This scenario leads to several major problems
- The transaction log is not truncated and internal log segments are not marked for reuse unless a transaction log backup is taken (for a database in FULL recovery model).
- This leads to an ever-growing transaction log. It is not uncommon to find a database with the transaction log file multiple times larger than the data file — the log at a 100 percent usage level and continuing to grow.
- In addition, point-in-time recovery is not guaranteed since there is no guarantee that the log can be backed up during a crash scenario.
- Other common mistakes include failing to back up system databases and failing to add recently created databases to existing backup maintenance plans.
- Ensure that the current backup strategy meets the business needs on a database by database basis.
- Ensure that databases are in their proper recovery model, and have the proper backup strategy configured, including nightly full/differential backups and appropriate transaction log backups occurring at a frequency that meets required SLAs for data loss.
- Your organization must also ensure that its database backups are protected by designing and implementing off-site replication solutions for those backups, whether it is to an off-site facility or the cloud.

SECURITY

- Security vulnerabilities can go unnoticed since they do not impede regular use of the data. In many environments, if the data can be accurately and reliably accessed, little thought is given to the level of access given to the accounts that use the data or the robustness of the security surrounding those accounts. Security vulnerabilities can account for a variety of database problems, including loss of data, undesired alteration of data, and malicious or unintended modifications to database structures.

- Instance-level logins with weak passwords including accounts with an empty password, the same password as the login name, a one character password, or a password with the reverse of the login name.

- These easily-guessed passwords lead to easy and unintended access into the database environment.

- What is even more damaging is that many of these accounts have elevated permissions on either specific databases or the entire instance.

- Lastly, many environments contained far more sysadmin level accounts than necessary, such as developer logins with full privileges to perform an activity in production instances.

- Remove excess privileges and provide the correct levels of security access needed to perform certain tasks

- Remove write permissions for accounts that only need to run reports. This limits the damage a compromised account can do.

- When changing passwords and limiting security, it is a good time to trace the security needs down to specific objects and rights to ensure a minimized attack surface.

- Properly segregate responsibilities and rights across your environment through the planning of appropriate development, testing, and production deployment strategies

CONFIGURATION
- It is extremely common to find SQL Server instances (and databases) configured with their out-of-box default settings permanently left in that state. Unfortunately, there are performance and security risks associated with the default configuration. When first launched, the performance impact is not noticed. However, as databases grow and their associated transactional activity increases — along with resource needs — the performance penalties of default configurations can severely impact the environment’s operations. Worse yet, the impact of default configurations from a security perspective is immediately detrimental and exposes a company’s environment to inadvertent and malicious attacks.
- Service packs and cumulative updates were not installed — all essential in plugging security holes, fixing program bugs, and enabling additional features to improve the environment’s efficiency and performance.
- Many configuration errors involving SQL memory settings were discovered. At default settings, the SQL Server buffer pool will expand until it causes contention with the operating system and other third-party services. This causes the entire server to slow down, perform unneeded I/O, and sometimes even hang or crash
- Many default settings were found at the database level, which can lead to performance problems such as file growth factors. For example, on a large multi-GB file, the default 10% growth factor can lead to many gigabytes of allocation in a single growth session, which can cause significant I/O pauses.
- Businesses must optimize configuration settings and install security updates to ensure the best performance and bugfree operation of their databases. Following these protocols, your organization’s environment will be better protected and function at a more efficient level.
- Database administrators (DBAs) must develop and execute upgrade plans for standalone instances as well as those that utilize advanced features such as mirroring, Microsoft Cluster Server (MSCS), replication, and Always On Availability Groups.
- Continually manage the reconfiguration of memory and other instance-level settings to ensure optimal use of server resources.
- From a database level, DBAs must make sure that the settings are aligned with the functionality and performance. For example, it is important to determine the proper file growth factors by taking into account the frequency of growth, the size of the growth session, and the time the growth occurs.
- Physical Memory - Often it is found that the infrastructure had adequate memory but the SQL Server instance was not able to utilize it for a variety of reasons. Consider virtualization technology, which is being deployed increasingly in virtualized database environments where extra care and attention is required
- CPU Pressure - When unacceptable CPU pressure occurs, the root cause is either a configuration setting or inefficient TSQL code. Reconfigure instance settings and/or tune TSQL statements to reduce or eliminate CPU pressure and the need to add hardware.
- Disk Layout - Pay special attention to the disk sub-system and how it interacts with Microsoft SQL Server databases, checking to see how it is optimized for performance and redundancy. Also check the physical layout of the databases with special attention given to data and log files for both user and system types of databases.
- Reconfigure instance settings and/or tune TSQL statements
- Consider virtualization technology

MAINTENANCE
- Database maintenance keeps a SQL Server environment running at optimum performance and can stop an impending disaster before it occurs. Maintenance routines act as automated tuning agents that preemptively remedy a host of issues that can occur naturally inside the databases such as index fragmentation, early database corruption detection, and system databases and file system cleanup.
- Missing or improperly configured maintenance plans
- No regular defragmentation is done on database indexes to ensure efficient data access and avoid unnecessary additional I/O. Similar to the defragmentation done on a server or PC hard drive, the same needs to be done on database indexes.
- Regular validation of data pages was not performed to ensure no corruption. Undetected corruption can grow over time. A database may appear to be fully operational while the corruption is being backed up. At some point, the corruption can spread and cause the database to become unusable. Unfortunately when that happens, the problem has already been backed up for awhile. Therefore, it is critical to detect corruption and handle it as soon as possible to eliminate or minimize data loss.
- Implement comprehensive set of best practices maintenance procedures. These procedures ensure database structures, such as indexes and statistics, are regularly defragmented and updated — resulting in the repair of lost performance due to naturally occurring data changes.
- Proper Purging Routines - Each instance keeps a history (e.g. for SQL jobs) that is kept in the MSDB system database and in the file system. It is important that this history is purged or archived on a regular basis to ensure optimal database performance
- Index Maintenance - To speed database queries, it is common practice to add indexes to key tables within the database. As data is added and removed from the table that contains an index, the index itself can be affected. Over time, these indexes can get lopsided and heavily fragmented and severely impact the performance of the database. To ensure optimal performance of a SQL Server database, it is a best practice to perform maintenance on indexes
- Data Integrity Checks - A data integrity failure can start small and then grow over time. If an integrity failure is left uncorrected, it can make its way into database backups. Eventually a database will go suspect or unavailable. At that point, data loss is inevitable and can be potentially quite large. Perform regular data integrity checks
- Perform regular data integrity checks
- Perform maintenance on indexes
- Purge or archive instance histories on a regular basis

DISASTER RECOVERY
- A disaster recovery (DR) plan is crucial in the event of a critical failure at the primary datacenter site and also during small-scale local events, such as extended power outages.
- There is no one-size-fitsall DR solution across such a diverse range of companies because of budget fluctuations and differences in the DR appropriations.
- DR plans could not meet the service level agreements (SLAs) for recovery time objectives (RTO) and/or recovery point objectives (RPO).
- Backups were shipped off-site to a safe location but no facility was in place to restore them or the related instance-level objectives in a timely manner.
- Customers assumed that periodically restoring backups on a remote server would allow for recovery in the event of failure — a process that doesn’t meet the requirements for a complete and usable DR site.
- Many customers failed to successfully test and validate DR procedures and plans.
- can achieve its DR goals by employing a variety of technologies, including log shipping, database mirroring, geo-dispersed clusters, SAN/virtual machine replication, and Availability Groups.
- The DR infrastructure and procedures must be implemented based on best practices and rigorously tested on an on-going basis.
- Each organization must take a tailored approach to effectively balance RTO and RPO needs against the implementation and maintenance costs.
- Consider a cloud service that eliminates the costs associated with the physical hardware and software of an on-premises solution.
- Microsoft lessens the cost burden of disaster recovery and fosters the concept of maintaining a DR site by requiring passive sites to pay licenses only once they become production sites for a certain period of time.

Friday, March 25, 2016

When To Break Down Complex Queries

Four problematic query patterns:

OR logic in the WHERE clause
In this pattern, the condition on each side of the OR operator in the WHERE or JOIN clause evaluates different tables. This can be resolved by use of a UNION operator instead of the OR operator in the WHERE or JOIN clause.

Aggregations in intermediate results sets
This pattern has joins on aggregated data sets, which can result in poor performance. This can be resolved by placing the aggregated intermediate result sets in temporary tables.

A large number of very complex joins
This pattern has a large number of joins, especially joins on ranges, which can result in poor performance because of progressively degrading estimates of cardinality. This can be resolved by breaking down the query and using temporary tables.

A CASE clause in the WHERE or JOIN clause
This pattern has CASE operators in the WHERE or JOIN clauses, which cause poor estimates of cardinality. This can be resolved by breaking down the cases into separate queries and using the Transact-SQL IF statement to direct the flow for the conditions.

https://blogs.msdn.microsoft.com/sqlcat/2013/09/09/when-to-break-down-complex-queries/

Default Trace

Often overlooked, the default trace is there with some useful audit info should you need it.

--see if default trace is enabled
SELECT *
FROM sys.configurations
WHERE configuration_id = 1568

--list events that are captured
SELECT distinct E.name "EventName"
FROM fn_trace_geteventinfo(1) I
INNER JOIN sts.trace_events E
ON I.eventid = E.trace_event_id

--path to trace file
SELECT reverse(substring(reverse(path), charindex('\',reverse(path)), 256)) "default_tracepath"
FROM sys.traces
WHERE is_default = 1

--query trace file
SELECT *
FROM sys.fn_trace_gettable(convert(varchar(150),
( SELECT top 1
FROM sys.fn_trace_getinfo(NULL) f
WHERE f.property = 2 )), default ) T
INNER JOIN sys.trace_events E
ON T.eventclass = E.trace_event_id

Wednesday, March 23, 2016

SET STATISTICS IO

SET STATISTICS IO

Displays the amount of disk activity that was generated by a query.

Meaning of output:

• Logical reads—Number of pages read from the buffer pool.

• Physical reads—Number of pages read from disk.

• Read-ahead read— Read-ahead is a performance optimization mechanism that anticipates the needed data pages and reads them from disk. It can read up to 64 contiguous pages from one data file.

• Lob logical reads—Number of large object (LOB) pages read from the buffer pool.

• Lob physical reads—Number of LOB pages read from disk.

• Lob read-ahead reads—Number of LOB pages read from disk using the read-ahead mechanism.

DiskSpeed

GUI for DiskSpeed

https://gallery.technet.microsoft.com/DiskSpeed-90d37d55

Monday, March 21, 2016

Best practices for Virtualizing SQL Server 2012

Best practices for Virtualizing SQL Server 2012

http://download.microsoft.com/download/6/1/d/61dde9b6-ab46-48ca-8380-d7714c9cb1ab/best_practices_for_virtualizing_and_managing_sql_server_2012.pdf

Detrimental effects of predicates on estimation of cardinality with the query processor.

• Queries with predicates that use comparison operators between different columns of the same table.

• Queries with predicates that use operators, and any one of the following are true:
o There are no statistics on the columns involved on either side of the operators.
o The distribution of values in the statistics is not uniform, but the query seeks a highly selective value set. This situation can be especially true if the operator is anything other than the equality (=) operator.
o The predicate uses the not equal to (!=) comparison operator or the NOT logical operator.

• Queries that use any of the SQL Server built-in functions or a scalar-valued, user-defined function whose argument is not a constant value.

• Queries that involve joining columns through arithmetic or string concatenation operators.

• Queries that compare variables whose values are not known when the query is compiled and optimized.

table variable myths

Table variables are not "in memory" tables, they function basically the same as #temporary tables and even create objects in tempdb.

The difference is that table variables are "lightweight" temporary tables in that they exist only for the scope of the batch and there are no statistics associated with them (the optimizer assumes only one row in table). Also, you cannot create indexes on them (aside from primary key and unique constraints).

So, table variables can improve performance and take up less resources in situations with small data sets, however larger data sets can have better performance as temporary tables.

SQL SERVER 2014 DATA TOOLS

SQL SERVER 2014 DATA TOOLS

For some reason as yet unknown Microsoft decided to stop shipping SQL Server Data Tools with the installation media. Adding to the confusion is that the tools are now split into different versions.

There is one version named just SQL Server Data Tools (SSDT), which is an add-on to Visual Studio. This version adds the toolset that was formerly known as Data Dude to Visual Studio. These SSDT tools include data and schema comparison, as well as support for refactoring databases, creating views, stored procedures, functions, and triggers. This version of SSDT provides both a live development mode and an offline project mode that helps provide source control for your databases by integrating with Visual Studio for team development, source control, and change tracking. There are different versions of these SSDT tools depending on the version of Visual Studio that you are using.

• SQL Server tooling in Visual Studio 2013—Visual Studio 2013 Express for Web, Express for Windows Desktop, Professional, Premium, and Ultimate include the SSDT tools. You don’t need a separate download. To check for the latest version of SSDT, open Visual Studio 2013 and choose the Tools, Extensions, and Updates menu. Then check the Updates section for Microsoft SQL Server Update for database tooling.

• SSDT Visual Studio 2012—there is a standalone install experience as well as an integrated install for the Visual Studio Professional, Premium, and Ultimate SKUs.

• SSDT Visual Studio 2010—this version of SSDT is no longer being updated.
The other version of SSDT is confusingly called SQL Server Data Tools – Business Intelligence (SSDT-BI). Although their names are almost identical, SSDT-BI is a distinctly different toolset than SSDT. SSDT-BI is the replacement for BIDS and it enables the development of Integration Services packages, Analysis Services cubes, and Reporting Services reports. Both versions of SSDT are no cost downloads for SQL Server users.

Source: 55144BC SQL SERVER PERFORMANCE TUNING AND OPTIMIZATION

Tuesday, March 15, 2016

i am the very model of a modern data professional

I am the very model of a modern Data-Professional,
I've information vegetable, animal, and mineral,
I know the kings of England, and I quote the fights historical
From Marathon to Waterloo, in order categorical;
I'm very well acquainted, too, with matters mathematical,
I understand equations, both the simple and quadratical,
About binomial theorem I'm teeming with a lot o' news,
With many cheerful facts about the square of the hypotenuse.
I'm very good at integral and differential calculus;
I know the scientific names of beings animalculous:
In short, in matters vegetable, animal, and mineral,
I am the very model of a modern Data-Professional.

I know our mythic history, King Arthur's and Sir Caradoc's;
I answer hard acrostics, I've a pretty taste for paradox,
I quote in elegiacs all the crimes of Heliogabalus,
In conics I can floor peculiarities parabolous;
I can tell undoubted Raphaels from Gerard Dows and Zoffanies,
I know the croaking chorus from The Frogs of Aristophanes!
Then I can hum a fugue of which I've heard the music's din afore,
And whistle all the airs from that infernal nonsense Pinafore.
Then I can write a washing bill in Babylonic cuneiform,
And tell you ev'ry detail of Caractacus's uniform:
In short, in matters vegetable, animal, and mineral,
I am the very model of a modern Data-Professional.

In fact, when I know what is meant by "mamelon" and "ravelin",
When I can tell at sight a Mauser rifle from a javelin,d
When such affairs as sorties and surprises I'm more wary at,
And when I know precisely what is meant by "commissariat",
When I have learnt what progress has been made in modern gunnery,
When I know more of tactics than a novice in a nunnery –
In short, when I've a smattering of elemental strategy –

You'll say a better Data-Professional has never sat a gee-e
For my military knowledge, though I'm plucky and adventury,
Has only been brought down to the beginning of the century;
But still, in matters vegetable, animal, and mineral,
I am the very model of a modern Data-Professional.