Increasing Recovery Area Size

To verify this run the following query. It will show the size of the recovery area and how full it is:

set lines 100
col name format a60
select	name
,	floor(space_limit / 1024 / 1024) "Size MB"
,	ceil(space_used  / 1024 / 1024) "Used MB"
from	v$recovery_file_dest
order by name
/

To fix the problem, you need to either make the flash recovery area larger, or remove some files from it.

If you have the disk space available, make the recovery area larger:

alter system set db_recovery_file_dest_size=<size> scope=both
/

To remove files you must use RMAN. Manually moving or deleting files will have no effect as oracle will be unaware. The obvious choice is to backup and remove some archive log files. However, if you usually write your RMAN backups to disk, this could prove tricky. RMAN will attempt to write the backup to the flash recovery area…which is full. You could try sending the backup elsewhere using a command such as this:

rman target / catalog user/pass@rmancat

run {
allocate channel t1 type disk;
backup archivelog all delete input format '/<temp backup location>/arch_%d_%u_%s';
release channel t1;
}

This will backup all archive log files to a location of your choice and then remove them.

http://www.shutdownabort.com/errors/ORA-19809.php

Enhanced by Zemanta

Recommended init parameters for Websphere Portal

Recommended init parameters for IBM Websphere Portal

wspprd.__db_cache_size=587202560
wspprd.__java_pool_size=16777216
wspprd.__large_pool_size=16777216
wspprd.__shared_pool_size=939524096
wspprd.__streams_pool_size=33554432
*.compatible=’10.2.0.3.0′
*.db_block_size=8192
*.db_cache_size=524288000
*.db_domain=”
*.db_file_multiblock_read_count=16
*.db_files=1024
*.db_recovery_file_dest=’/wspprd/oracle/product/10.2.0/flash_recovery_area’
*.db_recovery_file_dest_size=2147483648
*.job_queue_processes=10
*.open_cursors=1600
*.pga_aggregate_target=314572800
*.pre_page_sga=TRUE
*.processes=600
*.sessions=665
*.sga_max_size=1610612736
*.sga_target=1610612736
*.shared_pool_size=536870912
*.db_writer_processes=20
*.sort_area_size=65536
*.cursor_space_for_time=TRUE
*.session_cached_cursors=500
*.log_checkpoints_to_alert=TRUE
*.log_checkpoint_interval=0
*.log_checkpoint_timeout=0
*.fast_start_mttr_target=3600
*.log_buffer=419430400

Enhanced by Zemanta

SQL Server 2008 Backup

Backup Types

SQL Server 2008 allows you to create four different types of backups:

  • Full Backups
  • Differential Backups
  • Transaction log Backups
  • Filegroup Backups

Full Backups

A full backup captures all pages within a database that contain data. Pages that do not

contain data are not included in the backup.

Because it is more common to back up a database than to restore one, the backup engine

is optimized for the backup process. When a backup is initiated, the backup engine grabs

pages from the data files as quickly as possible, without regard to the order of pages. Because

the backup process is not concerned with the ordering of pages, multiple threads can be used

to write pages to your backup device.

Because changes canbe made to the database while a backup is running, SQL Server needs to be able to accommodate the changes while also ensuring that backups are consistent for restore purposes. To ensure both concurrent access and backup consistency, SQL Server performs the steps of the backup procedure as follows:

1. Locks the database, blocking all transactions

2. Places a mark in the transaction log

3. Releases the database lock

4. Extracts all pages in the data fi les and writes them to the backup device

5. Locks the database, blocking all transactions

6. Places a mark in the transaction log

7. Releases the database lock

8. Extracts the portion of the log between the marks and appends it to the backup

The only operations that are not allowed during a full backup are

  • Adding or removing a database file
  • Shrinking a database

The only two parameters required for a backup are the name of the database and the

backup device. When you specify a disk backup device, a directory and a fi le name can

be specifi ed. If a directory is not specifi ed, SQL Server performs a backup to disk and writes

the file to the default backup directory configured for the instance. Although most backups

are written to a single disk file or a single tape device, you can specify up to 64 backup

devices. When you specify more than one backup device, SQL Server stripes the backup

across each of the devices specified.

Enhanced by Zemanta

How to debug a Shell Script on Unix or Lunix

How to debug a Shell Script

Use -x option to debug a shell script

Run a shell script with -x option. This is valid for bash shell scripts as well.

$ ksh -x tesh.sh
$ ksh -v test.sh

Use of set builtin command

korn shell offers debugging options which can be turn on or off using set command.

=> set -x : Display commands and their arguments as they are executed.

=> set -v : Display shell input lines as they are read.

You can use above two command in shell script itself:

<span style="color: rgb(0, 0, 153); font-weight: bold;">#!/bin/ksh</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">clear</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;"># turn on debug mode</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">set -x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">for f in *</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">do</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">   file $f</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">done</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;"># turn OFF debug mode</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">set +x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">ls</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;"># more commands</span>

You can replace standard

#!/bin/ksh


with (for debugging)

#!/bin/ksh -xv

Use of intelligent DEBUG function

Add special variable _DEBUG. Set to `on’ when you need to debug a script:

_DEBUG="on"

Put the following function at the beginning of the script:

function DEBUG()<br />{<br /> [ "$_DEBUG" == "on" ] &&  $@ || :<br />}

Now wherever you need debugging simply use DEBUG function

DEBUG echo "File is $filename"

OR

DEBUG set -x

Cmd1

Cmd2

DEBUG set +x

When debugging done and before moving a script to production set _DEBUG to off

No need to delete debug lines.

_DEBUG="off" # set to anything but not to 'on'

Sample script:

<span style="color: rgb(0, 0, 153); font-weight: bold;">#!/bin/bash</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">_DEBUG="on"</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">function DEBUG()</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">{</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;"> [ "$_DEBUG" == "on" ] &&  $@ || :</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">}</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">DEBUG echo 'Reading files'</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">for i in *</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">do</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">  grep 'something' $i > /dev/null</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">  [ $? -eq 0 ] && echo "Found in $i file" || :</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">done</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">DEBUG set -x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">a=2</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">b=3</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">c=$(( $a + $b ))</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">DEBUG set +x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">echo "$a + $b = $c"</span><br />

Save and run the script:

$ ./script.sh

Output:

<span style="color: rgb(0, 0, 153); font-weight: bold;">Reading files</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">Found in xyz.txt file</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ a=2</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ b=3</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ c=5</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ DEBUG set +x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ '[' on == on ']'</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">+ set +x</span><br style="color: rgb(0, 0, 153); font-weight: bold;" /><span style="color: rgb(0, 0, 153); font-weight: bold;">2 + 3 = 5</span><br />

Now set DEBUG to off

_DEBUG="off"

Run script:

$ ./script.sh

Output:

<span style="font-weight: bold; color: rgb(0, 0, 153);">Found in xyz.txt file</span><br style="font-weight: bold; color: rgb(0, 0, 153);" /><span style="font-weight: bold; color: rgb(0, 0, 153);">2 + 3 = 5</span><br />

You can also try to use DEBUG as an alias instead of function.

Links

Korn Shell Debugger by OReilly

Bash Debugger Project

Powered by ScribeFire.

Join Types NL,Hash and Sort-Merge

Base Article

Nested loop (loop over loop)

In this algorithm, an outer loop is formed which consists of few entries and then for each entry, and inner loop is processed.

e.g.

Select tab1.*, tab2.* from tabl, tab2 where tabl.col1=tab2.col2;

It is processed like:

For i in (select * from tab1) loop
For j in (select * from tab2 where col2=i.col1) loop
Display results;
End loop;
End loop;

The Steps involved in doing nested loop are:

a) Identify outer (driving) table

b) Assign inner (driven) table to outer table.

c) For every row of outer table, access the rows of inner table.

In execution plan it is seen like this:

NESTED LOOPS
outer_loop
inner_loop

When optimizer uses nested loops?

Optimizer uses nested loop when we are joining tables containing small number of rows with an efficient driving condition. It is important to have an index on column of inner join table as this table is probed every time for a new value from outer table.

Optimizer may not use nested loop in case:

  1. No of rows of both the table is quite high
  2. Inner query always results in same set of records
  3. The access path of inner table is independent of data coming from outer table.

Note: You will see more use of nested loop when using FIRST_ROWS optimizer mode as it works on model of showing instantaneous results to user as they are fetched. There is no need for selecting caching any data before it is returned to user. In case of hash join it is needed and is explained below.

Hash join

Hash joins are used when the joining large tables. The optimizer uses smaller of the 2 tables to build a hash table in memory and the scans the large tables and compares the hash value (of rows from large table) with this hash table to find the joined rows.

The algorithm of hash join is divided in two parts

  1. Build a in-memory hash table on smaller of the two tables.
  2. Probe this hash table with hash value for each row second table

In simpler terms it works like

Build phase

For each row RW1 in small (left/build) table loop
Calculate hash value on RW1 join key
Insert RW1 in appropriate hash bucket.
End loop;

Probe Phase

For each row RW2 in big (right/probe) table loop
Calculate the hash value on RW2 join key
For each row RW1 in hash table loop
If RW1 joins with RW2
Return RW1, RW2
End loop;
End loop;

When optimizer uses hash join?

Optimizer uses has join while joining big tables or big fraction of small tables.

Unlike nested loop, the output of hash join result is not instantaneous as hash joining is blocked on building up hash table.

Note: You may see more hash joins used with ALL_ROWS optimizer mode, because it works on model of showing results after all the rows of at least one of the tables are hashed in hash table.

Sort merge join

Sort merge join is used to join two independent data sources. They perform better than nested loop when the volume of data is big in tables but not as good as hash joins in general.

They perform better than hash join when the join condition columns are already sorted or there is no sorting required.

The full operation is done in two parts:

  • Sort join operation

get first row RW1 from input1
get first row RW2 from input2.

  • Merge join operation

while not at the end of either input loop
if RW1 joins with RW2
get next row R2 from input 2
return (RW1, RW2)
else if RW1 < style=”"> get next row RW1 from input 1
else
get next row RW2 from input 2
end loop

Note: If the data is already sorted, first step is avoided.

Important point to understand is, unlike nested loop where driven (inner) table is read as many number of times as the input from outer table, in sort merge join each of the tables involved are accessed at most once. So they prove to be better than nested loop when the data set is large.

When optimizer uses Sort merge join?

a) When the join condition is an inequality condition (like <, <=, >=). This is because hash join cannot be used for inequality conditions and if the data set is large, nested loop is definitely not an option.

b) If sorting is anyways required due to some other attribute (other than join) like “order by”, optimizer prefers sort merge join over hash join as it is cheaper.

Note: Sort merge join can be seen with both ALL_ROWS and FIRST_ROWS optimizer hint because it works on a model of first sorting both the data sources and then start returning the results. So if the data set is large and you have FIRST_ROWS as optimizer goal, optimizer may prefer sort merge join over nested loop because of large data. And if you have ALL_ROWS as optimizer goal and if any inequality condition is used the SQL, optimizer may use sort-merge join over hash join

Reblog this post [with Zemanta]

Optimize Joins and Query Plan Changes

Joins and Query Plans

Details of the article can be viewed at ask Tom
Below are two queries posted by a user on askTom site.

</pre>1) SELECT distinct D.deptno, D.dname
    FROM dept D, emp E
        WHERE E.deptno = D.deptno
        ORDER BY D.deptno;

2) SELECT D.deptno, D.dname
    FROM dept D
        WHERE EXISTS
                 (
                 SELECT 1
                 FROM emp E
                 WHERE E.deptno = D.deptno
                 )
        ORDER BY D.deptno;

<span style="color: rgb(0, 0, 0); font-family: verdana;"><span>In general, you should phrase the queries in the manner  that says it best. If one set of joins were particularly efficient to use in all  cases, Oracle would not have implemented the rest of them! </span></span><br style="color: rgb(0, 0, 0); font-family: verdana;" /><br style="color: rgb(0, 0, 0); font-family: verdana;" /><span style="color: rgb(0, 0, 0); font-family: verdana;"><span>In general, you use a join when you need data from more  than one table in the ultimate SELECT list. Here you need data only from DEPT,  so it is unlikely that you would consider a join. You would instead consider  using either WHERE EXISTS or WHERE IN. (The cost-based optimizer [CBO] sees them  as more or less equivalent—you might, too.)</span></span><br style="color: rgb(0, 0, 0); font-family: verdana;" />
<span>I would write the query as either</span>

<pre>select deptno, dname
  from dept
 where exists
   ( select NULL
       from emp
      where emp.deptno
         = dept.deptno )
 order by deptno;

or

select deptno, dname<br />  from dept<br /> where deptno in <br />( select deptno <br />    from emp )<br /> order by deptno;

In both cases, the optimizer would employ either a semi join, which you cannot specify but the optimizer may perform (it stops joining after the first hit), or an index probe using NESTED LOOPS into EMP to test for row existence.

This example demonstrates, in general, what happens. When the optimizer detects that DEPT is small and EMP is large, it index-probes the large EMP table for each row in DEPT. On the other hand, if it deems DEPT to be large, the optimizer will do a pure semi join in bulk, with no indexes. I’m using a copy of the EMP and DEPT tables for the test:

SQL> create table emp as<br />  2  select * from scott.emp;<br />Table created.<br /><br />SQL> create table dept as<br />  2  select * from scott.dept;<br />Table created.<br /><br />SQL> create index emp_deptno_idx<br />  2  on emp(deptno);<br />Index created.

Then I set the first representative statistics. I use SET_TABLE_STATS to make EMP appear large, with 1,000,000 rows, and DEPT appear small, with only 100 rows. Additionally, I tell the optimizer about the index I envision having on EMP(DEPTNO). Because Oracle Database 10g Release 2 computes statistics on an index creation by default, I start by removing any existing statistics on this index and putting in place statistics that might be more representative:

 <br />SQL> begin<br />  2   dbms_stats.set_table_stats<br />  3   ( user,<br />  4    'EMP',<br />  5    numrows => 1000000,<br />  6    numblks => 100000 );<br />  7   dbms_stats.delete_index_stats<br />  8   ( user,<br />  9    'EMP_DEPTNO_IDX' );<br /> 10   dbms_stats.set_index_stats<br /> 11   ( user,<br /> 12    'EMP_DEPTNO_IDX',<br /> 13     numrows => 1000000,<br /> 14     numdist => 10000,<br /> 15     numlblks =>10000 );<br /> 16   dbms_stats.set_column_stats<br /> 17   ( user,<br /> 18    'EMP',<br /> 19    'DEPTNO',<br /> 20     DISTCNT => 10000 );<br /> 21   dbms_stats.set_table_stats<br /> 22   ( user,<br /> 23    'DEPT',<br /> 24     numrows=> 100,<br /> 25     numblks => 100 );<br /> 26  end;<br /> 27  /

When using the CBO with a large EMP and a small DEPT, Oracle Database might use a plan such as the one in Listing 1. The optimizer opts to read every row in DEPT and then performs an index range scan—the index probe into the EMP table—to see if the row exists in the other table. Because I have relatively few index probes to perform (about 100 is what the optimizer thinks) and because the EMP table is large, the optimizer uses this plan in both the WHERE IN and WHERE EXISTS cases.

Code Listing 1: CBO plan with large EMP and small DEPT

SQL> set autotrace traceonly explain<br />SQL> select deptno, dname<br />  2    from dept where deptno  in<br />  3    ( select deptno<br />  4        from emp )<br />  5   order by deptno;<br /><br />Execution Plan<br />-----------------------------------------------------------<br />Plan hash value: 3383088615<br /><br />-----------------------------------------------------------<br />|  Id |	Operation	| Name	        |   Rows | Bytes  |<br />-----------------------------------------------------------<br />|   0 |SELECT STATEMENT	|		|   100	 |  3500  |<br />|   1 |SORT ORDER BY	|		|   100	 |  3500  |<br />|   2 |NESTED LOOPS SEMI|		|   100	 |  3500  | <br />|   3 |TABLE ACCESS FULL| DEPT	        |   100	 |  2200  | <br />|*  4 |INDEX RANGE SCAN	| EMP_DEPTNO_IDX|  1000K |    12M |<br />-----------------------------------------------------------<br /><br />Predicate Information (identified by operation id):<br />-----------------------------------------------------------<br /><br />   4 - access("DEPTNO"="DEPTNO")<br /><br />SQL> select deptno, dname<br />  2    from dept where exists<br />  3    ( select null<br />  4        from emp<br />  5       where emp.deptno =<br />  6            dept.deptno )<br />  7   order by deptno;<br /><br />Execution Plan<br />------------------------------------------------------------<br />Plan hash value: 3383088615<br /><br />------------------------------------------------------------<br />|  Id |	Operation	 | Name	         |   Rows | Bytes  |<br />------------------------------------------------------------<br />|   0 |	SELECT STATEMENT |	         |   100  |  3500  |<br />|   1 | SORT ORDER BY	 |		 |   100  |  3500  |<br />|   2 | NESTED LOOPS SEMI|		 |   100  |  3500  |<br />|   3 | TABLE ACCESS FULL| DEPT	         |   100  |  2200  |<br />|*  4 | INDEX RANGE SCAN | EMP_DEPTNO_IDX|  1000K |    12M |<br />------------------------------------------------------------<br /><br />Predicate Information (identified by operation id):<br />------------------------------------------------------------<br /><br />   4 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

Compare that with the RBO plan, shown in Listing 2. When using the RBO, the developer must ask, “What size will EMP and DEPT be?” and, depending on the answer, choose either WHERE IN or WHERE EXISTS to get an “optimal” plan. The RBO comes up with two different plans based solely on how the question was asked, not on how large the amount of data was. This is one of the major advantages of the CBO over the RBO: Developers don’t have to know how to write the best query in all cases—which is not to say that they don’t need to learn SQL! They just do not have to figure out minutiae such as where they should use WHERE IN versus WHERE EXISTS.

Code Listing 2: RBO plan

SQL> select /*+ RULE */<br />  2    deptno, dname<br />  3    from dept where deptno  in<br />  4    ( select deptno<br />  5        from emp )<br />  6   order by deptno;<br /><br />Execution Plan<br />-------------------------------------<br />Plan hash value: 1932208647<br /><br />-------------------------------------<br />|  Id |	Operation	 | Name	    |<br />-------------------------------------<br />|   0 | SELECT STATEMENT |	    |<br />|   1 | MERGE JOIN       |	    |<br />|   2 | SORT JOIN	 |	    |<br />|   3 | TABLE ACCESS FULL| DEPT     |<br />|*  4 | SORT JOIN	 |	    |<br />|   5 | VIEW		 | VW_NSO_1 |			<br />|   6 | SORT UNIQUE	 |          |<br />|   7 | TABLE ACCESS FULL| EMP	    |<br />-------------------------------------<br /><br />Predicate Information (identified by operation id):<br />-------------------------------------<br /><br />   4 - access("DEPTNO"="$nso_col_1")<br />       filter("DEPTNO"="$nso_col_1")<br /><br />Note<br />----------<br />   - rule based optimizer used (consider using cbo)<br /><br />SQL> select /*+ RULE */<br />  2    deptno, dname<br />  3    from dept where exists<br />  4    ( select null<br />  5        from emp<br />  6       where emp.deptno =<br />  7            dept.deptno )<br />  8   order by deptno;<br /><br />Execution Plan<br />----------------------------------------<br />Plan hash value: 4109416194<br /><br />----------------------------------------<br />| Id | Operation	| Name	       |<br />----------------------------------------<br />|  0 | SELECT STATEMENT	|	       |<br />|  1 | SORT ORDER BY	|	       |<br />|* 2 | FILTER		|	       |<br />|  3 | TABLE ACCESS FULL| DEPT 	       |<br />|* 4 | INDEX RANGE SCAN	| EMP_DEPTNO_ID|<br />----------------------------------------<br /><br />Predicate Information (identified by operation id):<br />----------------------------------------<br /><br />   2 - filter( EXISTS (SELECT 0 FROM "EMP" "EMP" WHERE<br />                "EMP"."DEPTNO"=:B1))<br />   4 - access("EMP"."DEPTNO"=:B1)

Now, back to the CBO. As the amount of data in DEPT increases—and the optimizer decides that it will have to perform significantly more index probes into the EMP table—the query plans change. They go from using NESTED LOOPS and INDEX RANGE SCANS to more-efficient bulk operations. To see this, I tell the optimizer that DEPT is much larger now

SQL> begin<br />  2     dbms_stats.set_table_stats<br />  3     ( user,<br />  4      'DEPT',<br />  5       numrows=> 100000,<br />  6       numblks => 10000 );<br />  7  end;<br />  8  /

and then rerun the queries and review the resulting plans, as shown in Listing 3. As you can see, the optimizer, in both cases, now chooses to perform full scans and a nice big hash semi join—a join that stops after the first match—instead of using an index probe, because it would have to use the index 100,000 times.

Code Listing 3: CBO plan with larger DEPT

SQL> set autotrace traceonly explain<br />SQL> select deptno, dname<br />  2    from dept where deptno  in<br />  3    ( select deptno<br />  4        from emp )<br />  5   order by deptno;<br /><br />Execution Plan<br />-----------------------------------------------------------------<br />Plan hash value: 3127359958<br /><br />-----------------------------------------------------------------<br />| Id |	Operation	    | Name	     |	 Rows |  Bytes	|	<br />-----------------------------------------------------------------<br />|  0 |	SELECT STATEMENT    |		     |   100K |  3417K	|	<br />|  1 |  SORT ORDER BY	    |		     |   100K |  3417K	|<br />|* 2 |  HASH JOIN SEMI	    |		     |   100K |  3417K	|<br />|  3 |  TABLE ACCESS FULL   | DEPT	     |   100K |  2148K	|	<br />|  4 |  INDEX FAST FULL SCAN| EMP_DEPTNO_IDX |   1000K|    12M	|<br />-----------------------------------------------------------------<br />Predicate Information (identified by operation id):<br />-----------------------------------------------------------------<br /><br />   2 - access("DEPTNO"="DEPTNO")<br /><br />SQL> select deptno, dname<br />  2    from dept where exists<br />  3    ( select null<br />  4        from emp<br />  5       where emp.deptno =<br />  6            dept.deptno )<br />  7   order by deptno;<br /><br />Execution Plan<br />---------------------------------------------------------------<br />Plan hash value: 3127359958<br /><br />---------------------------------------------------------------<br />| Id |	Operation	    | Name	    |	Rows |   Bytes|	<br />---------------------------------------------------------------<br />|  0 |	SELECT STATEMENT    |		    |   100K |  3417K |	<br />|  1 |  SORT ORDER BY	    |		    |   100K |  3417K | <br />|* 2 |  HASH JOIN SEMI	    |		    |   100K |  3417K | <br />|  3 |  TABLE ACCESS FULL   | DEPT	    |   100K |  2148K |	<br />|  4 |  INDEX FAST FULL SCAN| EMP_DEPTNO_IDX|   1000K|    12M |	<br />---------------------------------------------------------------<br /><br />Predicate Information (identified by operation id):<br />---------------------------------------------------------------<br /><br />   2 - access("EMP"."DEPTNO"="DEPT"."DEPTNO")

Another important thing to note about these examples is that the RBO never performed a semi join, as the CBO did. The RBO has a very limited set of access paths and join operations available—it simply doesn’t know how to do a semi join and won’t use that join operation. Instead it performs operations such as the SORT DISTINCT it did in Listing 2, which is much less efficient than the cost-based plan in this case. Another reason to abandon the RBO in favor of the CBO!

Why Does My Plan Change?
I have noticed that gathering statistics against tables where none of the data has changed can cause the query plans against those tables to change. For example, one day I gather statistics, run some queries, and save the query plans in a report. The next day, I gather statistics (but I have not modified any of the data in any table) and I discover that the plans are different. What could cause that? The new query plans were better, but I would like to understand how this could be.

This change is likely due to the default value of the METHOD_OPT parameter used with DBMS_STATS. In Oracle Database 10g, the METHOD_OPT parameter defaults to a SIZE AUTO. After you ran a query, the database remembered the predicates and updated a dictionary table, SYS.COL_USAGE$. Then, the next time you ran DBMS_STATS to gather statistics on these tables, DBMS_STATS queried that table to find out what columns should have histograms collected automatically, based on past query workload. It looked at your predicates and said, “Hmm, these columns are candidates for histograms based on the queries the end users have been running.”

You can see this easily with a relatively small example. I start out by creating a small table with some skewed data in a column named ID and gathering the default set of statistics on it. The data in the ID column is such that for the values 0 through 4, about 20 percent of the table will be retrieved, but for the value 99, only a single row will be returned. I’m using this exaggerated example just to see the plans change:

SQL> create table t<br />  2  as<br />  3  select mod(rownum,5) id, a.*<br />  4    from all_objects a;<br />Table created.<br />SQL> update t<br />  2     set id = 99<br />  3   where rownum = 1;<br />1 row updated.<br /><br />SQL> create index t_idx on t(id);<br />Index created.<br /><br />SQL> begin<br />  2   dbms_stats.gather_table_stats<br />  3   ( user, 'T' );<br />  4  end;<br />  5  /<br /><br />SQL> select column_name, count(*)<br />  2    from user_tab_histograms<br />  3   where table_name = 'T'<br />  4     and column_name = 'ID'<br />  5   group by column_name;<br /><br />COLUMN_NAME   COUNT(*)<br />-----------   ------------<br />ID                  2

So, right now, the ID column doesn’t have complete histograms, by default—just two buckets, as shown by the query against USER_TAB_HISTOGRAMS. To understand the skewed nature of the data, I need a histogram with more than just two buckets; the two entries in USER_TAB_HISTOGRAMS tell the optimizer only the high and low values right now.

The optimizer knows the high value (99), the low value (0), the number of distinct values (6 in this case), and the number of rows in my table T (50,119 when I tested). Given those facts, the optimizer will believe that WHERE ID = will return about 50,119/6, or 8,353 rows. Sure enough, when I query with ID=1 or ID=99, I observe the results in Listing 4.

Code Listing 4: CBO plans when ID=1 and ID=99

SQL> set autotrace traceonly explain<br />SQL> select *<br />  2    from t<br />  3   where id = 1;<br /><br />----------------------------------------------------------------------------<br />|  Id |	Operation	 | Name	|  Rows	|  Bytes|   Cost (%CPU)	|  Time	   |<br />----------------------------------------------------------------------------<br />|   0 |	SELECT STATEMENT |	|  8353	|   783K|   163	(2)	| 00:00:02 |<br />|*  1 | TABLE ACCESS FULL| T	|  8353	|   783K|   163	(2)	| 00:00:02 |<br />----------------------------------------------------------------------------<br /><br />SQL> select *<br />  2    from t<br />  3   where id = 99;<br /><br />Execution Plan<br />----------------------------------------------------------------------------<br />Plan hash value: 1601196873<br /><br />----------------------------------------------------------------------------<br />|  Id |	Operation	 | Name	|   Rows |  Bytes|  Cost (%CPU)	| Time	   |<br />----------------------------------------------------------------------------<br />|   0 |	SELECT STATEMENT |	|   8353 |   783K|   163 (2)	| 00:00:02 |<br />|*  1 | TABLE ACCESS FULL| T	|   8353 |   783K|   163 (2)	| 00:00:02 |<br />----------------------------------------------------------------------------

Now, I immediately gather statistics, using the same command as before:

SQL> begin<br />  2   dbms_stats.gather_table_stats<br />  3   ( user, 'T' );<br />  4  end;<br />  5  /<br />SQL> select column_name, count(*)<br />  2    from user_tab_histograms<br />  3   where table_name = 'T'<br />  4     and column_name = 'ID'<br />  5   group by column_name;<br /><br />COLUMN_NAME   COUNT(*)<br />------------  -----------<br />ID                  5

Note, however, that I have more than two buckets for my histograms. DBMS_STATS, using the AUTO setting, gathered more information here. If I were to query the SYS.COL_USAGE$ table right now, I would discover that a new row was added, indicating that I have some queries in my system that use equality predicates against this particular database column. That is the “magic” that caused DBMS_STATS to change how it gathered statistics and caused a dramatic change in my resulting query plans, as shown in Listing 5.

Code Listing 5: New CBO plans when ID=1 and ID=99

  2    from t<br />  3   where id = 1;<br /><br />Execution Plan<br />----------------------------------------------------------------------------<br />Plan hash value: 1601196873<br /><br />----------------------------------------------------------------------------<br />| Id |	Operation	 | Name	|  Rows	|  Bytes|  Cost (%CPU)	| Time	   |<br />----------------------------------------------------------------------------<br />|  0 |	SELECT STATEMENT |	|  10260|   961K|   164	(2)	| 00:00:02 |<br />|* 1 |  TABLE ACCESS FULL| T	|  10260|   961K|   164	(2)	| 00:00:02 |<br />----------------------------------------------------------------------------<br /><br />SQL> select *<br />  2    from t<br />  3   where id = 99;<br /><br />Execution Plan<br />----------------------------------------------------------------------------------------<br />Plan hash value: 470836197<br /><br />----------------------------------------------------------------------------------------<br />| Id |  Operation	           | Name    | Rows |  Bytes|  Cost (%CPU)  |  Time    |<br />----------------------------------------------------------------------------------------<br />|  0 |  SELECT STATEMENT	   |	     | 1    |    96	|   2	(0) | 00:00:01 |<br />|  1 |  TABLE ACCESS BY INDEX ROWID|	T    | 1    |    96	|   2	(0) | 00:00:01 |<br />|* 2 |  INDEX RANGE SCAN	   |	T_IDX| 1    |   	|   1	(0) | 00:00:01 |<br />----------------------------------------------------------------------------------------

Note how the cardinalities are very different—8,353 changed to either 10,260 or 1. The resulting change in estimated cardinality led to a change in the overall cost of the first query plan and resulted in an entirely different (but better) query plan in the second case.

This is a significant fact for a DBA to be aware of. As the queries submitted by your end users change over time, the AUTO feature of statistics gathering could change what statistics are gathered over time as well. Unless you understand this feature, enabled by default in Oracle Database 10g, it can look as if inexplicable magic is happening in your database.

Powered by ScribeFire.

Reblog this post [with Zemanta]

Unicode Characterset in Oracle

Before starting this post let’s have an idea about unicode. Unicode is a Universal encoding scheme which is designed to include far more characters than the normal character set, in fact, Unicode wants to be able to list ALL characters. So, with unicode support in oracle data from any languages can be stored and retrieved from oracle.
Oracle supports unicode within many of the character sets starting from Oracle 7.
Below is the list of character sets that is used to support unicode in oracle.

 

Database Character Set Statement of Direction

A list of character sets has been compiled in Table A-4, "Recommended ASCII Database Character Sets" and Table A-5, "Recommended EBCDIC Database Character Sets" that Oracle Corporation strongly recommends for usage as the database character set. Other Oracle-supported character sets that do not appear on this list can continue to be used in Oracle Database 10g Release 2, but may be desupported in a future release. Starting with the next major functional release after Oracle Database 10g Release 2, the choice for the database character set will be limited to this list of recommended character sets for new system deployment. Customers will still be able to migrate their existing databases in the next major functional release after Oracle Database 10g Release 2 even if the character set is not on the recommended list. However, Oracle suggests that customers migrate to a recommended character set as soon as possible. At the top of the list of character sets Oracle recommends for all new system deployment is the Unicode character set AL32UTF8.

Choosing Unicode as a Database Character Set

Oracle Corporation recommends using Unicode for all new system deployments. Migrating legacy systems eventually to Unicode is also recommended. Deploying your systems today in Unicode offers many advantages in usability, compatibility, and extensibility. Oracle Database’s comprehensive support enables you to deploy high-performing systems faster and more easily while utilizing the advantages of Unicode. Even if you do not need to support multilingual data today or have any requirement for Unicode, it is still likely to be the best choice for a new system in the long run and will ultimately save you time and money as well as give you competitive advantages. See Chapter 6, "Supporting Multilingual Databases with Unicode" for more information about Unicode.

Choosing a National Character Set

A national character set is an alternate character set that enables you to store Unicode character data in a database that does not have a Unicode database character set. Other reasons for choosing a national character set are:

  • The properties of a different character encoding scheme may be more desirable for extensive character processing operations.

  • Programming in the national character set is easier.

SQL NCHAR, NVARCHAR2, and NCLOB datatypes have been redefined to support Unicode data only. You can use either the UTF8 or the AL 16UTF16 character set. The default is AL16UTF16.

 

Character Sets

http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14225/applocaledata.htm#i635016

Oracle-supported character sets are listed in the following sections according to three broad categories.

In addition, common character set subset/superset combinations are listed. Some character sets can only be used with certain data types. For example, the AL16UTF16 character set can only be used as an NCHAR character set, and not as a database character set.

 

Can You use AL16UTF16 as NLS_CHARACTERSET?

No, AL16UTF16 can only be used as NLS_NCHAR_CHARACTERSET in 9i and above. Trying to create a database with  a AL16UTF16 NLS_CHARACTERSET will fail.

(Source of this answer is REPETTAS WORDPRESS BLOG )

 

1) AL24UTFFSS:

This character set was the first Unicode character set supported by Oracle. The AL24UTFFSS encoding scheme was based on the Unicode 1.1 standard, which is now obsolete. This unicode character set was used between oracle version 7.2 to 8.1.

 

2) UTF-8:

UTF8 was the UTF-8 encoded character set in Oracle8 and 8i. It followed the
Unicode 2.1 standard between Oracle 8.0 and 8.1.6, and was upgraded to Unicode
version 3.0 for oracle versions 8.1.7, 9i, 10g and 11g. If supplementary characters are inserted into in a UTF8 database encoded with Unicode version 3.0, then the actual data will be treated as 2 separate undefined characters, occupying 6 bytes in storage. So for fully support of supplementary characters use AL32UTF8 character set instead of UTF8.

3) UTFE:

UTFE has the same properties as UTF8 on ASCII based platforms. As of UTF8 it is used in different oracle versions.

 

4) AL32UTF8:

This is the UTF-8 encoded character set introduced in Oracle9i.
In Oracle 9.2 AL32UTF8 implemented unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports the Unicode 5.0.
AL32UTF8 was introduced to provide support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes in AL32UTF8. As while designed UTF-8 there was no concept of supplementary characters therefore UTF8 has a maximum of 3 bytes per character.

5) AL16UTF16: This is the first UTF-16 encoded character set in Oracle. It was introduced in Oracle9i as the default national character set (NLS_NCHAR_CHARACTERSET). It also provides support for the newly defined supplementary characters. All supplementary characters are stored as 4 bytes.
As with AL32UTF8, the plan is to keep enhancing AL16UTF16 as
necessary to support future version of the Unicode standard.
AL16UTF16 cannot be used as a database character set (NLS_CHARACTERSET), it is only used as the national character set (NLS_NCHAR_CHARACTERSET).
Like, AL32UTF8 In Oracle 9.0 AL16UTF16 implemented unicode 3.0,
in Oracle 9.2 it implemented unicode 3.1,
in 10.1 it implemented the Unicode 3.2 standard,
in Oracle 10.2 it supports the Unicode 4.01 standard and
in Oracle 11.1 it supports the Unicode 5.0.

Reblog this post [with Zemanta]

Creating Oracle Database for Websphere Portal

Considerations before Creating database for Websphere Portal

View some important considerations before setting up Oracle databases to work with WebSphere Portal.

For information about creating databases, refer to the Oracle product documentation. For information on the recommended database architecture and the databases you will need to create, see the Planning for Oracle topic. Be sure that all databases to be used with WebSphere Portal are created as UNICODE character set databases.

If you are using remote Oracle databases, you must also copy the ojdbc14.jar file from the remote Oracle server to the WebSphere Portal machine. The typical location is the oracle_home/jdbc/lib directory. Record the copy location on your local machine for future reference.

When creating Oracle databases for use with WebSphere Portal, you should consider the following information:

  • The Oracle databases must be created manually before configuring WebSphere Portal.
  • All databases must be created using UNICODE Database and National character sets such as UTF8, AL32UTF8, or AL16UTF16.
  • It is recommended that all databases to be used with WebSphere Portal are configured in Dedicated Server Mode.
  • Determine if your Oracle server will be remote or local to the WebSphere Portal installation.
  • If using an earlier version of Oracle (9i), ensure that Oracle JVM is also installed.
  • After installing the database software for WebSphere Portal, you will need to set the buffer pools allocated to the Oracle database in order for WebSphere Portal to communicate with the Java Content Repository database. Use the following recommended values as a guide. Refer to Oracle product documentation for information on how to set the buffer pools. Recommended initial buffer pool sizes:
    db_block_size = 8192
    db_cache_size = 300M
    db_files = 1024
    log_buffer = 65536
    open_cursors = 1500
    pga_aggregate_target = 200M
    pre_page_sga = true
    processes = 300
    shared_pool_size = 200M

Reblog this post [with Zemanta]

SQL Server Data Warehousing concepts Video

An introduction to SQL Server data warehousing concepts

Data warehousing Concepts
Relational warehouse Concepts
Multi-Dimensional Concepts

Data Warehouse Components

Data Sources

ETL

Relational Data warehouse (SQL Server)

Dimensional Data warehouse (Analysis Services)

Graphical User Interface Tools

  • Use SQL to query relational warehouse
  • Use MDX (Multi dimensional expressions  to query cubes)
Reblog this post [with Zemanta]

Examining data files when SQL Server tempdb is full

Source Link

What does it mean when the tempdb is full?

When SQL Server’s tempdb is full, upper management will often demand a fix, a group of developers will frantically deflect blame and junior DBAs will become afraid to touch anything at all.

As I tell administrators, the first rule of thumb is simple: Keep calm. Never let the scenario unfolding cause extra stress, as that can lead to critical mistakes.

Now that we have set up the scenario, let’s address the question. The tempdb database consists of two parts: the data file(s) in the primary file group, and the tempdb log file(s). Either area could be the culprit, but the error message will tell you which part is full. Let’s concentrate on the data file portion; we will address the log file in a later article.

How can I narrow down the source?

First, let’s take a look at how to determine what takes up the most space, whichserver process ID number (SPID) we are dealing with and which host the request comes from. The query below will return the top 1000 SPIDs that are taking up space in the database. Keep in mind that the values returned are page counts. To make it easier, I included the calculations to come up with the memory values (in megabytes). Also, note that these counters are cumulative over the life of the SPID:

SELECT top 1000
s.host_name, su.[session_id], d.name [DBName], su.[database_id],
su.[user_objects_alloc_page_count] [Usr_Pg_Alloc], su.[user_objects_dealloc_page_count] [Usr_Pg_DeAlloc],
su.[internal_objects_alloc_page_count] [Int_Pg_Alloc], su.[internal_objects_dealloc_page_count] [Int_Pg_DeAlloc],
(su.[user_objects_alloc_page_count]*1.0/128) [Usr_Alloc_MB], (su.[user_objects_dealloc_page_count]*1.0/128)
[Usr_DeAlloc_MB],
(su.[internal_objects_alloc_page_count]*1.0/128) [Int_Alloc_MB], (su.[internal_objects_dealloc_page_count]*1.0/128)
[Int_DeAlloc_MB]
FROM [sys].[dm_db_session_space_usage] su
inner join sys.databases d on su.database_id = d.database_id
inner join sys.dm_exec_sessions s on su.session_id = s.session_id
where (su.user_objects_alloc_page_count > 0 or
su.internal_objects_alloc_page_count > 0)
order by case when su.user_objects_alloc_page_count > su.internal_objects_alloc_page_count then
su.user_objects_alloc_page_count elsesu.internal_objects_alloc_page_count end desc

The next query is similar; it returns the top 1000 items in which SPIDs have de-allocated the most space. This query can be used to trace a process that loops, creates objects as it goes, or creates and deletes many temporary objects as it runs:

SELECT top 1000 s.host_name, su.[session_id], d.name [DBName], su.[database_id],
su.[user_objects_alloc_page_count] [Usr_Pg_Alloc], su.[user_objects_dealloc_page_count] [Usr_Pg_DeAlloc],
su.[internal_objects_alloc_page_count] [Int_Pg_Alloc], su.[internal_objects_dealloc_page_count] [Int_Pg_DeAlloc],
(su.[user_objects_alloc_page_count]*1.0/128) [Usr_Alloc_MB], (su.[user_objects_dealloc_page_count]*1.0/128)
[Usr_DeAlloc_MB],
(su.[internal_objects_alloc_page_count]*1.0/128) [Int_Alloc_MB], (su.[internal_objects_dealloc_page_count]*1.0/128)
[Int_DeAlloc_MB]
FROM [sys].[dm_db_session_space_usage] su
inner join sys.databases d on su.database_id = d.database_id
inner join sys.dm_exec_sessions s on su.session_id = s.session_id
where (su.user_objects_dealloc_page_count > 0 or
su.internal_objects_dealloc_page_count > 0)
order by case whensu.user_objects_dealloc_page_count > su.internal_objects_dealloc_page_count then
su.user_objects_dealloc_page_count elsesu.internal_objects_dealloc_page_count end desc

 

Since the tempdb does not report its size properly after it has shrunk, the following query will get you the available space in tempdb:

SELECT sum(unallocated_extent_page_count) [Free_Pages],
(sum(unallocated_extent_page_count)*1.0/128) [Free_Space_MB]
FROM sys.dm_db_file_space_usage

 

Once you have determined the SPID, you can determine what T-SQL is running with dbcc inputbuffer(SPID).Let’s presume you know the T-SQL code that ran, but you also need to know the temporary table(s) involved. You could run the following:

select * from tempdb.sys.objects where type = 'u'

 

Temporary tables originating from users in T-SQL should have the format of#YourDefinedTblName____UniqueID. This will help you identify the code involved. You could also join the SPID involved from the sys.dm_exec_requestscommand and use sys.dm_exec_sql_text(SQL_Handle) to get the line running at the time, but this requires a "polling loop" to monitor when the script is actually running.