English

How To: Set user-defined statistics for st_geometry attributes and st_spatial_index in Oracle

Summary

Instructions provided describe how to set user-defined statistics for st_geometry attributes and st_spatial_index in Oracle. Statistics are used by the Oracle optimizer to select the best access path for data retrieval when executing a SQL statement. St_geometry and st_spatial_index objects also have statistics that are used by the st_domain_stats type to help the Oracle optimizer determine if a spatial index should be used or not as the access path.

Statistics are gathered on tables and indexes when the table is analyzed, either by using the Analyze command in ArcCatalog or the Oracle command dbms_stats.gather_table_stats. When statistics are gathered on a table containing an st_geometry attribute and spatial index, additional statistics are gathered about the geometry and its spatial index, which are stored in the ArcSDE administrator's table st_geometry_index. These statistic values are then used to derive the selectivity of spatial queries and the cost of using the spatial index as the optimizer's access path, verses a full table scan and filter for each candidate row.

In some unique and specific cases, the statistics gathered may not correctly represent the characteristics of the data. Therefore, just as Oracle provides a mechanism to set user-defined statistics with tables and indexes, there is the ability to set an st_geometry and spatial index statistics using the sde.spx_util procedure.

By changing the st_geometry or st_spatial_index statistic values for a table, it influences the optimizer on the cost of using the spatial index and what access path is selected when executing a SQL statement that includes spatial relational operators.

If user statistics have been set, when the next table statistics are gathered on the table, the user defined statistic values will be over written.

Procedure

The sde spx_util package contains two procedures, set_column_stats and set_index_stats, which can be executed to set user-defined statistic values.

Code:
sde.spx_util.set_column_stats
(owner IN VARCHAR2,
table_name IN VARCHAR2,
column_name IN VARCHAR2,
distcnt IN NUMBER DEFAULT NULL,
nullcnt IN NUMBER DEFAULT NULL)

sde.spx_util.set_index_stats
(owner IN VARCHAR2,
table_name IN VARCHAR2,
index_name IN VARCHAR2,
numrows IN NUMBER DEFAULT NULL,
numlblks IN NUMBER DEFAULT NULL,
clstfct IN NUMBER DEFAULT NULL,
density IN NUMBER DEFAULT NULL,
indlevel IN NUMBER DEFAULT NULL)

The following examples demonstrate how to set the number of null geometries for a table and the density (the average number of geometries per grid cell) for a spatial index.

  1. To set the number of NULL geometries for an st_geometry attribute, execute the sde.spx_util.set_column_stats procedure.

    Code:
    SQL> EXECUTE sde.spx_util.set_column_stats('tb','transformers','shape',NULL,3500);

    PL/SQL procedure successfully completed.

    In this example, the transformer's table contains 3,500 rows with NULL shapes. The transformer objects are assets to the organization, which must be included in the transformer's table but have yet to be placed in service and continue to reside in storage and therefore have no geometry value.

    Setting the accurate column statistic to represent the true number of NULL geometries helps inform the Oracle optimizer in selecting the optimal access path when performing a spatial query. By knowing there are a certain number of NULL geometries, the cardinality returned by the optimizer will not mistakenly include the rows that are NULL.
  2. To set the density of the spatial index, execute the sde.spx_util.set_index_stats procedure.

    Code:
    SQL> EXECUTE sde.spx_util.set_index_stats('tb','poles','poles_shp_idx',NULL,NULL,NULL,125);

    PL/SQL procedure successfully completed.

    This example demonstrates setting the density value for the pole table's spatial index named poles_shp_idx. By setting the density value, its informing the optimizer that on average each grid cell contains 125 poles. When a spatial query is executed, the input search envelope is inspected and used as input to calculate the selectivity and cardinality of the spatial query. For example, if the search envelope for the spatial query was half the size of the grid size, then the cardinality for the query would be 62 (meaning 62 poles are likely to be returned by the spatial query). The selectivity is also calculated by dividing the number of candidate features (derived from the density) by the total number of rows in the table minus the number of NULL geometries. Selectivity is a very important number, because it is used as input for calculating a cost of using the spatial index.