SQL Theory

What happens in the background when you run a SQL ?

Below are the step that happen sequentially.

SQL Parser – checks for correct syntax

SQL Optimizer – determines the efficient way for processing the SQL. The execution path or the plan

SQL Execution Engine – Processes each row and completes execution plan.

How to optimize , the SQL query ?

-use only the necessary columns in the query

-use equals instead of IN

-use >= or <= compared to between

-use the LIKE carefully , for example ‘man%’ will used index and will be fast however ‘%man’ won’t

-use OR instead of UNION. Don’t use unnecessary DISTINCT

-instead of using NOT NULL constraint which is SQL intensive , check for NOT NULL of a variable inside a IF block

What is Index ?

Index is an optional structure associated with a table that can sometimes speed up the data access. By creating Index on one or more columns of a table in some cases they retrieve a small set of rows from a table reducing the disk I/O.

CREATE UNIQUE INDEX Cust_Idx ON CUSTOMER_DETAILS (Cust_ID);

How to check all the indexes in a schema ?

SELECT user_tables.table_name, user_indexes.index_name
FROM user_tables JOIN user_indexes on user_indexes.table_name = user_tables.table_name

select * from USER_INDEXES

How to check the Execution Plan for a Query ?

Run the query and then run the below SQL.Here DBMS_XPLAN is oracle package which is specifically created to display the Execution plan. Full and fantastic details mentioned here

https://docs.oracle.com/database/121/ARPLS/d_xplan.htm#ARPLS70132

select plan_table_output
from table(dbms_xplan.display_cursor(null,null,'basic'));

using SQL id

select plan_table_output from
table(dbms_xplan.display_cursor('fnrtqw9c233tt',null,'basic'));

 

The output can be formatted like below

select plan_table_output
from table(dbms_xplan.display('plan_table',null,'basic +predicate +cost'));
select plan_table_output from
table(dbms_xplan.display('plan_table',null,'typical -cost -bytes'));

https://blogs.oracle.com/optimizer/entry/displaying_and_reading_the_execution_plans_for_a_sql_statement

What is a RDBMS ?

It is a system made up of relations. Where there is a relation between the data among tables. Here each table is defined as to be made up of rows and columns.

Famous commands

1.to describe table structure

describe table_name;

2.to know all the constraints on a table

select * from ALL_CONSTRAINTS;

 

3.Insert Command

insert into phone_number (phone_id,country_code,phone_number) VALUES (10,1,6544708)

insert into phone_number (phone_id,country_code,area_code,phone_number) VALUES (11,1,NULL,6544709)

both above statement will enter NULL to the area code , However having area code in the columns and entering just , ,

Blank doesn’t work

4.NVL function

if the value is null then replace that with something else.

SELECT NVL(supplier_city, ‘n/a’) FROM suppliers;

5.Foreign Key Table creation

create table kit (player_id number(4), kit_id number(2), CONSTRAINT kit_id_pk PRIMARY KEY(KIT_ID),CONSTRAINT player_id_fk FOREIGN KEY(player_id) REFERENCES player(player_id))

How to delete the Rows that are referenced ?

Better way, is while creating the table use below constraint. If not done then you will have to Alter the table.

foreign key option as on delete cascade

why gather stats ?

Everyday the object (table) data or structure is modified, hence the new statistics must be gathered.

For example, after loading a significant number of rows into a table, you should collect new statistics on the number of rows.
After updating data in a table, you do not need to collect new statistics on the number of rows but you might need new statistics on the average row length.

How is it done ?

Oracle does it by . simple word, estimation , computation and user defined method

Statistics generated include the following:

Table statistics
-Number of rows
-Number of blocks
-Average row length
Column statistics
-Number of distinct values (NDV) in column
-Number of nulls in column
-Data distribution (histogram)
Index statistics
-Number of leaf blocks
-Levels
-Clustering factor
System statistics
I/O performance and utilization
CPU performance and utilization

History 

When the Oracle database was first introduced the decision of how to execute a SQL statement was determined by a Rule Based Optimizer (RBO). The Rule Based Optimizer, as the name implies, followed a set of rules to determine the execution plan for a SQL statement. The rules were ranked so if there were two possible rules that could be applied to a SQL statement the rule with the lowest rank would be used.

In Oracle Database 7, the Cost Based Optimizer (CBO) was introduced to deal with the enhanced functionality being added to the Oracle Database at this time, including parallel execution and partitioning, and to take the actual data content and distribution into account. The Cost Based Optimizer examines all of the possible plans for a SQL statement and picks the one with the lowest cost, where cost represents the estimated resource usage for a given plan. The lower the cost the more efficient an execution plan is expected to be. In order for the Cost Based Optimizer to accurately determine the cost for an execution plan it must have information about all of the objects (tables and indexes) accessed in the SQL statement, and information about the system on which the SQL statement will be run.

This necessary information is commonly referred to as Optimizer statistics.

Optimizer statistics are a collection of data that describe the database, and the objects in the database. These statistics are used by the Optimizer to choose the best execution plan for each SQL statement. Statistics are stored in the data dictionary, and can be accessed using data dictionary views such as USER_TAB_STATISTICS.

k4

Table and Column Statistics

Table statistics include information on the number of rows in the table, the number of data blocks used for the table, as well as the average row length in the table. The Optimizer uses this information, in conjunction with other statistics, to compute the cost of various operations in an execution plan, and to estimate the number of rows the operation will produce. For example, the cost of a table access is calculated using the number of data blocks combined with the value of the parameter

DB_FILE_MULTIBLOCK_READ_COUNT. You can view table statistics in the dictionary view USER_TAB_STATISTICS.

Column statistics include information on the number of distinct values in a column (NDV) as well as the minimum and maximum value found in the column. You can view column statistics in the dictionary view USER_TAB_COL_STATISTICS. The Optimizer uses the column statistics information in conjunction with the table statistics (number of rows) to estimate the number of rows that will be Understanding Optimizer Statistics 3 returned by a SQL operation. For example, if a table has 100 records, and the table access evaluates an equality predicate on a column that has 10 distinct values, then the Optimizer, assuming uniform data distribution, estimates the cardinality to be the number of rows in the table divided by the number of distinct values for the column or 100/10 = 10.

k5.jpg

select * from USER_TAB_COL_STATISTICS where table_name=’test’

k6

How to view the Constraints on Toad

schema browser->select the schema->constraints
primary key is blue yellow
foreign key is blue color
check is sign of red
unique is green color

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s