tcpdump -vvv -i eth0 -s 0 port bootps
debug
All posts tagged debug
Make sure that grub is looking at the correct location for the kernel. Â I just had a situation where I copied a VM grub menu which was using LVM to a physical host on accident , when I rebooted grub errored out. Â This is a work in progress..
Drop to grub command line.
esc
type ‘c’
grub> Â find /boot/vmlinuz
(hd0,2)
hit esc
select your correct kernel in grub, type ‘e’
Verify your config is pointing to correct partition, if not as was the case for me, change to whatever find found and hit enter, then type ‘b’ to boot.
askmethod
Do not automatically use the CD-ROM as the install source if we detect a Red Hat Enterprise Linux CD in your CD-ROM drive.
autostep
Make kickstart non-interactive.
debug
Start up pdb immediately.
dd
Use a driver disk.
dhcpclass=
Sends a custom DHCP vendor class identifier. ISC’s dhcpcd can inspect this value using “option vendor-class-identifier”.
dns=
Comma separated list of nameservers to use for a network installation.
driverdisk
Same as ‘dd’.
expert
Turns on special features:
allows partitioning of removable media
prompts for a driver disk
gateway=
Gateway to use for a network installation.
graphical
Force graphical install. Required to have ftp/http use GUI.
isa
Prompt user for ISA devices configuration.
ip=
IP to use for a network installation, use ‘dhcp’ for DHCP.
keymap=
Keyboard layout to use. Valid values are those which can be used for the ‘keyboard’ kickstart command.
ks=nfs:
ks=http://
ks=floppy
The installation program looks for the file ks.cfg on a vfat or ext2 file system on the diskette in /dev/fd0.
ks=floppy:/
ks=hd:
The installation program mounts the file system on
ks=file:/
The installation program tries to read the file
ks=cdrom:/
ks
If ks is used alone, the installation program configures the Ethernet card to use DHCP. The kickstart file is read from the “bootServer” from the DHCP response as if it is an NFS server sharing the kickstart file. By default, the bootServer is the same as the DHCP server. The name of the kickstart file is one of the following:
If DHCP is specified and the boot file begins with a /, the boot file provided by DHCP is looked for on the NFS server.
If DHCP is specified and the boot file begins with something other than a /, the boot file provided by DHCP is looked for in the /kickstart directory on the NFS server.
If DHCP did not specify a boot file, then the installation program tries to read the file /kickstart/1.2.3.4-kickstart, where 1.2.3.4 is the numeric IP address of the machine being installed.
ksdevice=
The installation program uses this network device to connect to the network. For example, to start a kickstart installation with the kickstart file on an NFS server that is connected to the system through the eth1 device, use the command ks=nfs:
kssendmac
Adds HTTP headers to ks=http:// request that can be helpful for provisioning systems. Includes MAC address of all nics in CGI environment variables of the form: “X-RHN-Provisioning-MAC-0: eth0 01:23:45:67:89:ab”.
lang=
Language to use for the installation. This should be a language which is valid to be used with the ‘lang’ kickstart command.
loglevel=
Set the minimum level required for messages to be logged. Values for
lowres
Force GUI installer to run at 640×480.
mediacheck
Activates loader code to give user option of testing integrity of install source (if an ISO-based method).
method=cdrom
Do a CDROM based installation.
method=ftp://
method=hd://
method=http://
method=nfs:
netmask=
Netmask to use for a network installation.
nofallback
If GUI fails exit.
nofb
Do not load the VGA16 framebuffer required for doing text-mode installation in some languages.
nofirewire
Do not load support for firewire devices.
noipv6
Disable IPv6 networking during installation.
nokill
A debugging option that prevents anaconda from terminating all running programs when a fatal error occurs.
nomount
Don’t automatically mount any installed Linux partitions in rescue mode.
nonet
Do not auto-probe network devices.
noparport
Do not attempt to load support for parallel ports.
nopass
Don’t pass keyboard/mouse info to stage 2 installer, good for testing keyboard and mouse config screens in stage2 installer during network installs.
nopcmcia
Ignore PCMCIA controller in system.
noprobe
Do not attempt to detect hw, prompts user instead.
noshell
Do not put a shell on tty2 during install.
nostorage
Do not auto-probe storage devices (SCSI, IDE, RAID).
nousb
Do not load USB support (helps if install hangs early sometimes).
nousbstorage
Do not load usbstorage module in loader. May help with device ordering on SCSI systems.
rescue
Run rescue environment.
resolution=
Run installer in mode specified, ‘1024×768’ for example.
serial
Turns on serial console support.
skipddc
Skips DDC probe of monitor, may help if it’s hanging system.
syslog=
Once installation is up and running, send log messages to the syslog process on
text
Force text mode install.
updates
Prompt for floppy containing updates (bug fixes).
updates=ftp://
updates=http://
upgradeany
Don’t require an /etc/redhat-release that matches the expected syntax to upgrade.
vnc
Enable vnc-based installation. You will need to connect to the machine using a vnc client application.
vncconnect=
Once installation is up and running, connect to the vnc client named
Requires ‘vnc’ option to be specified as well.
vncpassword=
Requires ‘vnc’ option to be specified as well.
Found this on one of my Fedora boxes after noticing it was pretty sluggish on bootup. Tool is called systemd-analyze.
$ sudo systemd-analyze
Startup finished in 3405ms (kernel) + 3034ms (initramfs) + 141900ms (userspace) = 148341ms
Find the culprits to your slow ass startup. Hello sendmail..
$ sudo systemd-analyze blame | head -7
60127ms sm-client.service
60058ms sendmail.service
13492ms fedora-loadmodules.service
8489ms fedora-wait-storage.service
5202ms remount-rootfs.service
3176ms systemd-vconsole-setup.service
3127ms fedora-storage-init.service
There’s a bunch more you can do with systemd. You can plot out your boot up times, can mask off services so they don’t startup, look at kernel times etc.
Might continue more on this another time..
Jephe Wu – http://linuxtechres.blogspot.com
Environment: MS SQL server 2008 R2, Windows server 2008
Objective: SQL server performance monitoring and tuning, database backup and restore procedures
Steps:
Configuration Changes:
According to – How to determine proper SQL Server configuration settings http://support.microsoft.com/kb/319942
SQL Server can obtain a very high level of performance with relatively little configuration tuning. You can obtain high levels of performance by using good application and database design, and not by extensive configuration tuning. See the “References” section of this article for information about how to troubleshoot various SQL Server performance issues.
When you address a performance problem, the degree of improvement that is available from configuration tuning is typically modest unless you do not currently have the system properly configured. In SQL Server version 7.0 and later, SQL Server uses automatic configuration tuning and it is extremely rare that configuration settings (especially advanced settings) need any changes. Generally, do not make a SQL Server configuration change without overwhelming reason and not without careful methodical testing to verify the need for the configuration change. You must establish a baseline before the configuration change so that you can measure the benefit after the change.
If you do not have SQL Server properly configured, some settings might de-stabilize the server or might make SQL Server behave erratically. Years of support experience with many different environments indicate that non-default configuration settings might have results that range from neutral to highly negative.
If you do make a configuration change, you must perform rigorous methodical performance testing both before and after the change to assess the degree of improvement.
Based on actual support scenarios, SQL Server version 7.0 and later can achieve an extremely high level of performance without any manual configuration tuning.
In SQL Server version 7.0 and later, do not make any configuration changes to user connections, locks, and open objects because, by default, SQL Server dynamically tunes these settings.
Part I: Performance monitoring:
Performance Counter threshold value:
http://www.grumpyolddba.co.uk/monitoring/Performance%20Counter%20Guidance%20-%20SQL%20Server.htm
http://www.grumpyolddba.co.uk/monitoring/Performance%20Counter%20Guidance%20-%20Windows%20Server.htm
http://www.grumpyolddba.co.uk/sql2000/KB702_SQL%20Performance%20Tuning%20using%20Waits%20and%20Queues.mht
http://msdn.microsoft.com/en-us/library/ms189628.aspx – performance objects explanation
1. overview
check CPU and memory first, then disk I/O, lastly network.
because memory problem can trigger disk I/O problem.
then check database internal performance counters, server minimum/maximum memory configuration, data/log files, tempdb data/log files, paging files.
2. CPU bottleneck check
a. firstly check task manager to see if any other application is using most of cpu, rather than sql server. If it’s sqlserver, then check further
b. run ‘perfmon’, add performance counter
processor information – % Processor time the percentage of elapsed time that all of process threads used the processor to execution instructions should <80%. Process is the object created when you run a program; thread is object that spawned by progress to be used to execute instruction.
Processor - % user time should be > 70%
Processor – % privileges time – The operating system switches application threads to privileged mode to access operating system services. should be < 20%
SQL Server runs in User mode. Privileged mode, is designed for operating system components and allows direct access to hardware and all memory.
System - Processor Queue Length - Number of threads waiting to be scheduled for CPU time
System:Processor Queue Length is the processor queue for all CPUs on a Windows system. If System: Processor Queue Length is greater than two for each CPU, it may indicates a CPU bottleneck
Some caused for high processor queue length:
1. Unnecessary compilation and recompilation.
2. memory pressure
3. lack of proper indexing
System - Context Switches/sec - if it's greater than 20,000 per second, that's a lot.
sys.dm_os_wait_stats for Signal waits and Total waits, if Signal waits > 25% of total waits indicates cpu bottleneck
Note: to counter the number of cpu in sql server:
select cpu_count from sys.dm_os_sys_info
· SQL Server: SQL Statistics: Batch Requests/sec the number of batch requests that SQL Server receives per second, indicates how busy you sql server are. This is relative number, depending your CPU power and how busy your database is.
a 100M ethernet can roughly handle 1000 batch request/sec
· SQL Server: SQL Statistics: SQL Compilations/sec (should be as lower as possible < 100 maybe)
· SQL Server: SQL Statistics: SQL Recompilations/sec (better to be nearly 0)
Measurement:
sql recompilation/sec : sql compilation/sec should be < 10%.
sql compilation/sec : batch request/sec should be < 10%
3. Memory bottleneck check
The default setting for min server memory is 0, and the default setting for max server memory is 2147483647 MB in MS SQL 2008
run 'perfmon', check the following:
SQL server: Buffer Manager - Average Page Life Expectancy Counter - should not less than 300 (5mins)
How long can a page stay in the memory before being flushed out to disk. too lower indicates lack of memory or missing index.(full table scan)
SQL server: Memory Manager - memory grant pending - task waiting for RAM to execute.
Memory: - paging file % usage - should be as lower as possible, < 70% based on KB 889654
Memory: - paging file % usage peak - should be as lower as possible, < 70% based on KB 889654
Memory: - Available MBytes - should not be less than 200
Memory: - Page fault /s
Page fault /s includes both hard faults (those that require disk access) and soft faults (where the faulted page is found elsewhere in physical memory.) Most processors can handle large numbers of soft faults without significant consequence. However, hard faults, which require disk access, can cause significant delays
Memory: - pages/sec - indicator of the amount of paging that Windows is performing
Memory: - Page Reads/sec (should <=5)
This counter is probably the best indicator of a memory shortage because it indicates how often the system is reading from disk because of hard page faults. The system is always using the pagefile even if there is enough RAM to support all of the applications. Thus, some number of page reads will always be encountered. However, a sustained value over 5 Page Reads/sec is often a strong indicator of a memory shortage
If Memory: Pages/sec is greater than zero or Memory: Page Reads/sec is greater than five, Windows is using disk to resolve memory references (hard page fault). This costs disk I/O + CPU resources.
Note:
a. If you have less memory, the disk I/O will increase, they are related, you should consider both sides if you suspect your system has performance issue.
4. Disk I/O bottleneck
We should concentrate on disk queuing instead of on the actual I/O for each disk as actual disk I/O speed cannot be adjusted.
In perfmon,
PhysicalDisk - avg. Disk Queue Length (if exceeds twice the number of spindles, then you are likely developing a bottleneck, you have to know how many disk behind the system, if it's SAN system, it might be difficult to get that information, even SAN administrator might not know exactly)
Note: Disk queuing might be caused by short of memory, check pages/sec value also.
Physical disk - avg. disk seconds/read - indicates the average time, in seconds, of a read of data from the disk. (4-8ms is ideal, if > 15ms might indicate disk bottlenecks
refer to http://technet.microsoft.com/en-US/library/cc917690 for IO bottleneck part
Less than 10 ms – very good
Between 10 – 20 ms – okay
Between 20 – 50 ms – slow, needs attention
Greater than 50 ms – Serious I/O bottleneck
According to http://download.microsoft.com/download/F/1/0/F10BC023-9396-4D67-BAC9-7C43AE954BF4/SQLServer2000_SQL_Performance_Tuning_using_Waits_and_Queues.pdf
If disk sec/read > normal read time (ask vendor for normal read time) you can consider the following options:
1. Resolve IO bottleneck by adding more drives; spreading IO across new drives if possible e.g. move files such as database, transaction log, other application files that are being written to or read from.
2. Check for memory pressure – see memory component.
3. Check for proper indexing of SQL tables. Proper indexing can save IO. Check SQL query plans looking for scans and sorts, etc. Showplan identifies sorting steps.
4. Run SQL Profiler to identify TSQL statements doing scans. In Profiler, select the scans event class & scan stopped event. Go to the data column tab and add object Id. Run the trace. Save the profiler trace to a trace table, and then search for the scans event. Alternately, you can search for high duration, reads, and writes.
Physical disk – avg. disk seconds/write (< 8ms (non cached) and <=1ms for highend SAN) The throughput for high volume OLTP applications is dependent on fast sequential transaction log writes. Non cached Writes Excellent < 08 Msec ( .008 seconds ) Good < 12 Msec ( .012 seconds ) Fair < 20 Msec ( .020 seconds ) Poor > 20 Msec ( .020 seconds )
Cached Writes Only
Excellent < 01 Msec ( .001 seconds )
Good < 02 Msec ( .002 seconds )
Fair < 04 Msec ( .004 seconds )
Poor > 04 Msec ( .004 seconds
Physical disk: % disk time – should be < 50% - percentage of elapsed time that the selected disk drive was busy servicing read or write requests. Physical disk - avg. disk reads/s and Physical disk - avg. disk write/s Note: this 2 values indicate how busy is your database server, just for measuring purpose to know how busy your system is , compare to 3 month or a year ago Should maintain above 2 values < 85% disk capacity Example: When you use these counters, you may need to adjust the values for RAID configurations using the following formulas: · Raid 0 -- I/Os per disk = (reads + writes) / number of disks · Raid 1 -- I/Os per disk = [reads + (2 * writes)] / 2 · Raid 5 -- I/Os per disk = [reads + (4 * writes)] / number of disks · Raid 10 -- I/Os per disk = [reads + (2 * writes)] / number of disks For example, you might have a RAID-1 system with two physical disks with the following values of the counters. Disk Reads/sec 80 Disk Writes/sec 70 Avg. Disk Queue Length 5 Then to maintain a raid configuration below, you need: Raid 0 - need 75 IO per disk Raid 1 - need 110 IO per disk Raid 5 - need 180 IO per disk Raid 10 - need 110 IO per disk The method of watching out for disk queuing associated with SQL Server log files is different from SQL Server database files. For log file, we use SQL Server: database: log flush wait time SQL server: database: log flush waits/sec c. raid level - use raid 10 if possible Each SQL Server write to the mirrorset results in two disk I/O operations When dynamiclly adding new disk into raid array, The RAID controller will move some existing SQL Server data to these new drives so data is evenly distributed across all drives in the RAID array d. sequential I/O sequentional I/O is faster than non-sequential one, most of hard disk perform 2 times better for sequential than non-sequential I/O operations Note: Logs generally are not a major concern because transaction log data is always written sequentially to the log file in sizes ranging up to 32 KB. 5. SQL Server See SQL Server, Buffer Manager Object - http://msdn.microsoft.com/en-us/library/ms189628.aspx for various objects explanation SQL Server: Buffer Manager: Buffer Cache Hit Ratio, should be at least 90% Note: a major part of any database server environment is the management of memory buffer cache. less than 90% may indicate memory pressure or missing index. See SQL Profiler: Stored Procedure: CacheHit, CacheMiss, and CacheInsert to see what stored procedure query plans are already in cache (Hit), vs. those not in cache (Miss,Insert) SQL server: cache manager: Cache Hit Ratio - Percentage of time the procedure plan pages are already in cache e.g. procedure cache hits. I.e. how often a compiled procedure is found in the procedure cache (thus avoiding the need to recompile). SQL server: Buffer Manager: Page Life expectancy - Sudden big drop in page life expectancy - DW applications (e.g. big transactions) could experience big drops in page life expectancy. This is due to a cache flush from a big read or average page life expectancy less than 300 which is 5 minutes - OLTP database - indicate memory pressure, missing indexes, or a cache flush. SQL server: access method : Forwarded Records/sec Number of records fetched through forwarded record pointers. Tables with NO clustered index. If you start out with a short row, and update the row creating a wider row, the row may no longer fit on the data page. A pointer will be put in its place and the row will be forwarded to another page. Look at code to determine where the short row is inserted followed by an update. Can be avoided by: 1. Using Default values (so that an update will not result in a longer row that is the root cause of forwarded records). 2. Using Char instead of varchar (fixes length so that an update will not result in a longer row SQL Server: access method: Full Scan/sec - The number of unrestricted full scans. These can either be base table or full index scans. - should be as lower as possible > 1 might indicate problem
SQL Profiler can be used to identify which TSQL statements do scan. Select the scans event class & events scan:started and scan:completed. Include the object Id data column. Save the profiler trace to a trace table, and then search for the scans event.
The scan:completed event will provide associated IO so you can also search for high reads, writes, and duration.
MS SQL Server: Access method: index search/s (ideally index search/s : full scan /s should > 1000)
Index searches are used to start range scans, single index record fetches, and to reposition within an index. Compare to Full Scan/sec. You want to see high values for index searches.
MS SQL Server: Access method: Page Splits/sec – Number of page splits occurring as the result of index pages overflowing. Normally associated with leaf pages of clustered indexes and non-clustered indexes.
Page splits are extra IO overhead that results from random inserts.
When there is no room on a data page, and the row must be inserted on the page (due to index order), SQL will split the page moving half the rows to a new page, and then insert the new row.
Correlate to Disk: page sec/write. If this is very high, you may reorg the index(es) on the table(s) causing the page splits, to reduce page splits temporarily. Fillfactor will leave a certain amount of space available for inserts. see http://msdn.microsoft.com/en-us/library/aa933139(v=sql.80).aspx for what’s fillfactor.
MS SQL Server: Buffer Manager: Checkpoint pages/sec – Pages written to disk during the checkpoint process, freeing up SQL cache
Indicates the number of pages flushed to disk per second by a checkpoint or other operation that require all dirty pages to be flushed.
Memory pressure is indicated if this counters is high along with high lazy writes/sec and low page life expectancy (<300 seconds)
MS SQL Server: Buffer Manager: Lazy writes/sec - Pages written to disk by the lazywriter, freeing up SQL cache
Indicates the number of buffers written per second by the buffer manager's lazy writer. The lazy writer is a system process that flushes out batches of dirty, aged buffers (buffers that contain changes that must be written back to disk before the buffer can be reused for a different page) and makes them available to user processes. The lazy writer eliminates the need to perform frequent checkpoints in order to create available buffers.
Memory pressure is indicated if above this counters is high along with high lazy writes/sec and low page life expectancy (<300 seconds)
MS SQL Server: Buffer Manager: Readahead pages/sec - indicates the number of pages read per second in anticipation of use.
If memory shortages, cold cache, or low hit rates, SQL may use worker threads to readahead (bring in pages ahead of time) to raise hit rates. By itself readahead is not a problem unless users are flushing each other’s pages consistently.
Check for proper indexing and bad query plans (scans in profiler)
SQL Server: database: log flush wait time (waiting for transaction log write request to complete
and SQL server: database: log growths (Windows will automatically grow transaction log to accommodate insert, update, and delete activity.)
In general, growths of the transaction log will temporarily freeze writes to the transaction log while Windows grows the transaction log file. Check to see that the growth increment is large enough. If not, performance will suffer as log growths will occur more often.
SQL Server: log file size and used log file size
SQL Server: database file size
SQL server: tempdb file size
SQL server: database: Transactions /sec – SQL Server transactions per second
MS SQL server: Transactions – Longest Transaction Running time.
check long transaction records view
sys.dm_tran_database_transactions – columns of particular interest include the time of the first log record (database_transaction_begin_time), the current state of the transaction (database_transaction_state), and the log sequence number (LSN) of the begin record in the transaction log (database_transaction_begin_lsn).
In perfmon, check
General statistics: login/s (number of logins per second), logout/s
General statistics: User Connections : not very accurate since some application uses connection pool, there are many users behind connection pool
Latches : Average Latch Wait Time(ms)
Latches are short term light weight synchronization object. Latches are not held for the duration of a transaction. Typical latching operations during row transfers to memory, controlling modifications to row offset table, etc
Latches: Latch Waits/sec
Lock – Average Wait Time(ms) Transactions should be as short as possible to limit the blocking of other users.
Lock – Lock Waits/sec
Lock – number of deadlocks/sec
sys.dm_os_wait_stats – check top wait statistics
sys.dm_db_index_operational_stats – check lock and latch wait
8. Network bottlenecks
in perfmon, choose the following:
a. Network Interface – Current Bandwidth.
b. packets/sec
Note: actual bandwidth is computed as packets/sec * 1500 * 8 /1000000 Mbps.
9. tempdb
Also check paging file, separate data file and log file drive, tempdb data/log should be at separate drive.
Monitor the following Performance Monitor counters for any unusual increase in the temporary objects allocation/deal location activity:
· SQL Server:Access Methods\Workfiles Created /Sec
· SQL Server:Access Methods\Worktables Created /Sec
· SQL Server:Access Methods\Mixed Page Allocations /Sec
· SQL Server:General Statistics\Temp Tables Created /Sec
· SQL Server:General Statistics\Temp Tables for destruction
Solution: Increase the tempdb data files by an equal amount to distribute the workload across all of the disks and files. Ideally, you want to have as many files as there are CPUs (taking into account the affinity).
Move tempdb data and logfile to new location:
alter database tempdb modify file
(name=’tempdev’,filename= ‘e:\mssql7\tempnew_location.mDF’)
alter database tempdb modify file
(name=’templog’,filename= ‘c:\temp\tempnew_loglocation.mDF’)
Part II – performance tuning
1. CPU/Memory bottlenecks – upgrade server or adding more RAM
Note:
AWE: check it on 32-bit servers with >4GB of memory, and unchecked the rest of the time.
2. reduce disk I/O
buy faster disk (15k rpm)
using database partition to reduce I/O
separate data and log file disk
increase temp tablespace initial size and put to separate disk other than data/log files
configuring suitable maximum server memory in sql server to leave some for OS
suitable raid level:raid 0 or raid 0+1(raid 1/0 or raid10) , mirrored stripes (ms) , stripe first, mirror later, raid 0 first. Raid 10 is the bestfiles.
3. MS SQL server
create and maintain good index
recently used index is in sys.dm_db_index_usage_stats, any defined index not included in this DMV has not been used since the last re-start of SQL Server.
Too many indexes need frequent updates or unused indexes that need frequently maintenance
(insert/select/update) will create overhead, but without benefiting any users.
create appropriate indexing to avoid excessive sorting operations
Big IOs such as table and range scans due to missing indexes
Reduce table join for frequent queries. join overuse will result in long run query and waste resources.
Partition large data sets and indexes
Tune applications and queries.
It is very important for application developers to understand SQL Server architectural basics and how to take full advantage of SQL Server indexes to minimize I/O.
Lack of useful statistics:
http://msdn.microsoft.com/en-us/library/ms187348(v=sql.105).aspx
http://msdn.microsoft.com/en-us/library/ms190397(v=sql.105).aspx#UpdateStatistics
the following command calls sp_updatestats to update all statistics for the database.
EXEC sp_updatestats
other usage:
UPDATE STATISTICS Sales.SalesOrderDetail;
DBCC SHOW_STATISTICS
AUTO_CREATE_STATISTICS
AUTO_UPDATE_STATISTICS
AUTO_UPDATE_STATISTICS_ASYNC
Master database files:
The master database, msdb, and model databases are not used much during production compared to user databases, so it is typically not necessary to consider them in I/O performance tuning considerations. The master database is usually used only for adding new logins, databases, devices, and other system objects.
Part III – References:
1. http://sqlcat.com/sqlcat/b/top10lists/archive/2007/11/21/top-sql-server-2005-performance-issues-for-oltp-applications.aspx
2. SQL server performance tuning guide for dataware house – http://technet.microsoft.com/library/Cc966420
3. http://msdn.microsoft.com/en-us/library/ms178067.aspx
4. http://sqlcat.com/sqlcat/b/top10lists/archive/2007/11/21/top-10-sql-server-2005-performance-issues-for-data-warehouse-and-reporting-applications.aspx
5. sql long transaction view – http://msdn.microsoft.com/en-us/library/ms186957(v=sql.105).aspx
6. update statistics – http://msdn.microsoft.com/en-us/library/ms187348(v=sql.105).aspx
7. Performance Monitor Counters – http://technet.microsoft.com/en-us/library/cc768048.aspx
8. performance and tuning howto – http://msdn.microsoft.com/en-us/library/ms191511(v=sql.105).aspx
9. http://technet.microsoft.com/en-us/library/ms190619(v=sql.105)
10. establishing a performance baseline – http://technet.microsoft.com/en-us/library/cc781394(v=WS.10).aspx
11. storage best practise – http://technet.microsoft.com/library/Cc966534
12. Troubleshooting performance problem in sql server 2008 – http://msdn.microsoft.com/en-us/library/dd672789(v=SQL.100).aspx#phrss – excellent doc
13. best practice for sql sever maintenance – http://technet.microsoft.com/en-US/library/cc966447
14. SAN storage best practise for sql server – http://www.brentozar.com/sql/sql-server-san-best-practices/
15. sql server 2008 I/O performance – http://blogs.technet.com/b/josebda/archive/2009/03/31/sql-server-2008-i-o-performance.aspx
16. diagnosing and resolving Latch Content for SQL server 2008 R2 – http://www.microsoft.com/en-us/download/details.aspx?id=26665
17. How to: Configure SQL Server to Use Soft-NUMAï¼ http://msdn.microsoft.com/en-us/library/ms178144(v=sql.105).aspx
Part IV – useful DMVs
1. DMV list: http://msdn.microsoft.com/en-us/library/ms179984.aspx
2. commands:
select * from sys.dm_os_wait_stats order by wait_time_ms
select * from sys.dm_os_latch_stats;
Select wait_type,
waiting_tasks_count,
wait_time_ms
from sys.dm_os_wait_stats
where wait_type like ‘PAGEIOLATCH%’
order by wait_type
I/O wait at this moment
select
database_id,
file_id,
io_stall,
io_pending_ms_ticks,
scheduler_address
from sys.dm_io_virtual_file_stats(NULL, NULL)t1,
sys.dm_io_pending_io_requests as t2
where t1.file_handle = t2.io_handle
top 10 wait type
select top 10 *
from sys.dm_os_wait_stats
order by wait_time_ms desc
Part V – best practise
1. Isolate log from data at the physical disk level –
separate database data and log file disk, Pre-size data and log files.
According to http://technet.microsoft.com/library/Cc966534, It is recommended to have .25 to 1 data files (per filegroup) for each CPU on the host server.
tempdb data/logfile should on different disk spindle
create one data file for tempdb per cpu (Dual core counts as 2 CPUs; logical procs (hyperthreading) do not)
depending on tempdb usage, use raid10 for storage if possible
backup should go to separate disk
If you have real time anti-virus monitoring, it is recommended that you exclude the SQL Server database files (including data files, transaction log files, tempdb and other system database files) from real time monitoring
Update statistics on the largest tables weekly or monthly
Rebuild or defrag the most important indexes
2. storage disk for data file should not be shared with other application etc, dedicated for data file with enough spindle
3. vcpu and memory amount should be same as physical machine baseline if running in virtual environment.
4. If raid 1/0 or raid 1 for logfile disk if possible. RAID 1+0 provides better write performance than any other RAID level providing data protection, including RAID 5.
Part VI – testing tools
1. sql nexus
2. sql diag
3. sqlIO , alternatively , use IOMETER
download sqlio at http://www.microsoft.com/en-us/download/details.aspx?id=20163
http://www.brentozar.com/archive/2008/09/finding-your-san-bottlenecks-with-sqlio/
SAN performance tuning with sqlio – http://sqlserverpedia.com/wiki/SAN_Performance_Tuning_with_SQLIO
4. perfmon and sql profiler
http://www.simple-talk.com/sql/database-administration/correlating-sql-server-profiler-with-performance-monitor/ good article talking about how to corelate profiler and perfmon data
5. latch wait
For a non-production environment only, clear the sys.dm_os_wait_stats DMV with
the following command:
dbcc SQLPERF (‘sys.dm_os_wait_stats’, ‘CLEAR’)
A similar command can be run to clear the sys.dm_os_latch_stats DMV:
dbcc SQLPERF (‘sys.dm_os_latch_stats’, ‘CLEAR’)
Part VII – backup and restore
1. restore master datbase in single user mode – http://technet.microsoft.com/en-us/library/ms190679.aspx
2. Performing a Complete Database Restore (Full Recovery Model)
http://msdn.microsoft.com/en-us/library/ms187495(v=sql.105).aspx
http://msdn.microsoft.com/en-us/library/ms175199(v=sql.105).aspx
Part VIII – FAQ
1. if performance counter doesn’t load or corrupted, you can run this to fix:
cd \windows\system32
lodctr /R
Posted by Jephe Wu at 7/01/2012 07:53:00 PM 0 comments
Labels: backup and restore, ms sql, performance tuning
Older Posts Home
Subscribe to: Posts (Atom)
Search This Blog
Follow by Email
Subscribition
Posts
All Comments
Bookmark
Bookmark linuxtechres.blogspot.com
About me
15 years strong Linux/Unix (mainly RHEL/CentOS/OL) sysadmin experience in large production environments and data centers;
RHCE on RHEL5
5 years Oracle DBA experience; also strong on MySQL/MSSQL/DB2 databases
Email: jephewu at gmail.com
Linkedin: http://www.linkedin.com/in/jephewu