size_clause issue of parameters method_opt [message #522979] |
Thu, 15 September 2011 00:30 |
andy huang
Messages: 498 Registered: July 2011
|
Senior Member |
|
|
Dear all,
What is the function by specifed the size_clause of parameters
method_opt?
dbms_stats.gather_table_stats
method_opt := FOR ALL [INDEXED | HIDDEN] COLUMNS [size_clause]
FOR COLUMNS [size_clause]
column|attribute [size_clause]
[,column|attribute [size_clause] ... ]
size_clause := SIZE [integer | auto | skewonly | repeat],
where integer is between 1 and 254
|
|
|
|
|
|
Re: size_clause issue of parameters method_opt [message #522994 is a reply to message #522988] |
Thu, 15 September 2011 01:27 |
andy huang
Messages: 498 Registered: July 2011
|
Senior Member |
|
|
hi,
when i specified size auto,it can not generate histogram,why?
SQL> desc TB_HXL_ID;
Name Null? Type
----------------------------------------- -------- ----------------------------
ID NUMBER
SQL> select count(1) from tb_hxl_id;
COUNT(1)
----------
100000
SQL> Begin
2 Dbms_Stats.Delete_Table_Stats(Ownname => 'HXL', Tabname => 'TB_HXL_ID');
3
4 End;
5 /
PL/SQL procedure successfully completed.
SQL> Begin
2 Dbms_Stats.Gather_Table_Stats(Ownname => 'HXL',
3 Tabname => 'TB_HXL_ID',
4 Estimate_Percent => 100,
5 Method_Opt => 'for all columns size auto',
6 Degree => 2,
7 Cascade => True);
8 End;
9 /
PL/SQL procedure successfully completed.
There is not generate histogram.
SQL> Select Aa.Histogram
2 From Dba_Tab_Columns Aa
3 Where Aa.Owner = 'HXL'
4 And Aa.Table_Name = 'TB_HXL_ID'
5 /
HISTOGRAM
---------------
NONE
|
|
|
|
|
|
|
|
|
|
|
|
Re: size_clause issue of parameters method_opt [message #523016 is a reply to message #523015] |
Thu, 15 September 2011 03:24 |
andy huang
Messages: 498 Registered: July 2011
|
Senior Member |
|
|
hi,
It can generate a histogram when i using for all columns size 20,i want to know the detail rules about oracle generate histogram, can you help me?
SQL> Begin
2 Dbms_Stats.Delete_Table_Stats(Ownname => 'HXL', Tabname => 'TB_HXL_ID');
3
4 End;
5 /
PL/SQL procedure successfully completed.
SQL> Begin
2 Dbms_Stats.Gather_Table_Stats(Ownname => 'HXL',
3 Tabname => 'TB_HXL_ID',
4 Estimate_Percent => 100,
5 Method_Opt => 'for all columns size 20',
6 Degree => 2,
7 Cascade => True);
8 End;
9 /
PL/SQL procedure successfully completed.
SQL> Select Aa.Histogram
2 From Dba_Tab_Columns Aa
3 Where Aa.Owner = 'HXL'
4 And Aa.Table_Name = 'TB_HXL_ID';
HISTOGRAM
---------------
HEIGHT BALANCED
|
|
|
|
Re: size_clause issue of parameters method_opt [message #523439 is a reply to message #523028] |
Sun, 18 September 2011 16:16 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
While it is true that many details are not published, I don't accept that as an answer. Whether or not "details subject to change", Oracle must provide suficient information for us to reliably use each option of a command. Otherwise how are we to know when we should use them and if we are using them correctly. I note as well that the manuals often leave out necessary information or are sometimes vague in their explanations. I offer here then, my understanding of how this works. Be it accurate or not, this is how I work with stats.
It was noted that this was about generating histograms. With that in mind let us first explore why historgrams would be needed. Consider the following scenario:
1) A table contains 100 rows with a column called EFF_DATE. The first date is 1-jan-1900 and the last date is 31-dec-1999. There are in fact 100 years worth of dates between these two dates.
2) we have the query
select * from t1 where EFF_DATE >= '01-JAN-1999';
With only basic statistics, Oracle knows what we said in #1 above, that there are about 100 rows in the table, and that the EFF_DATE has a min of 01-jan-1900 and a max of 31-dec-1999. So how many rows should come back from the query?
Oracle will assume that there is an even distribution of rows between the two dates (well, this isn't exactly true either (that ol "density" thing), but we will make believe it is so for the greater discusion here). Since 01-jan-1999 is one year before the max date, Oracle will figure that >= '01-JAN-1999' should bring back 1 year out of 100 or 1% of the data. Since there are about 100 rows in the table, 1% of 100 equals 1 row.
3) But what if the data is mostly new. Let us now assume that there is 1 row where EFF_DATE = '01-jan-1900' and 99 rows where the EFF_DATE = '31-jan-1999'. Given this knowledge, we know that the query above should return 99 out if 100 rows, not 1 as Oracle predicts.
The problem of course is because our data is not evenly distributed with respect to EFF_DATE. The data is SKEWED to the right. Histograms are a way to give Oracle the special knowledge that we noted in #3.
Suppose however we did something simple like use SIZE=10. Oracle would create a little spreadsheet of sorts that looks something like this (it is actually different from this (more compact) but this gives you the idea of what is going on):
low_date high_date numrows
01-jan-1990 31-dec-1990 1
01-jan-1991 31-dec-1991 0
01-jan-1992 31-dec-1992 0
01-jan-1993 31-dec-1993 0
01-jan-1994 31-dec-1994 0
01-jan-1995 31-dec-1995 0
01-jan-1996 31-dec-1996 0
01-jan-1997 31-dec-1997 0
01-jan-1998 31-dec-1998 0
01-jan-1999 ----------- 99
Given this spreadsheet, Oracle now knows that the query
select * from t1 where EFF_DATE >= '01-JAN-1999';
will return around 99 rows. Again Oracle stores this information differently and more compact than I show, but you get the idea of how SKEWED data can cause bad plans and how knowing about the SKEWED distribution of the data to some level of detail can provide a better idea of the number of rows a query or query piece will return and thus get us better plans.
So histograms are about dealing with SKEWED data. This is data that is not evenly distributed on some column (or with extended statistics, a set of columns). This leads us to your question: what are the histogram options and what do they do. Well, my understanding is this:
SIZE (20) will create up to 20 buckets of histogram data for each of whatever columns you are collecting histograms on.
SIZE REPEAT will "do what ever you did before"
SIZE SKEWONLY will examine each column and decide for itself if a column is skewed, and of so, collect however many buckets of histograms it thinks it needs to deal with the skew.
SIZE AUTO does the same thing as SKEWONLY but ... it only does it for columns identifed by some "WORKLOAD". What workload is, is not actually clear.
Is it, anyting referenced by a query that is currently in the SQL CACHE at the time the collection is done?
Is it any query that Oracle sees in a WORKLOAD REPOSITORY area?
Is it any query Oracle has saved someplace special that we don't know about?
I actually don't know. What I read says "currently executing" but who knows if that is right.
OK, so who knows better than I on this one? Kevin
|
|
|