Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

Practical assignment 1 - Query optimization (Oracle)

COMP7104-DASC7104

This assignment is a concrete application of the concepts, structures, algorithms presented in class, on query execution and query optimization, with the ORACLE DBMS. This system provides a good example of a sophisticated optimizer based on index structures and comprehensive evaluation algorithms. All the join algorithms described in class are indeed implemented in ORACLE. In addition, the system offers simple and practical tools (AUTOTRACE and EXPLAIN) to understand the execution plan chosen by the optimizer, as well as to obtain performance statistics (I/O cost and CPU cost, among others).

Starting the Oracle DBMS and connecting to it with SQL Developer

I can launch the Oracle server (its Docker image) by running the following command:

docker run -d --name oracle-ee -v

/Users/Bogdan/data/OracleDBData:/opt/oracle/oradata -p 1521:1521 -p

5500:5500 -e ORACLE_PWD='BoGDdanC1!el' container-

registry.oracle.com/database/enterprise:21.3.0.0

In your command, you must replace

1)   /Users/Bogdan/data/OracleDBData with  a  directory that you  created  beforehand  on your machine. This is the directory where the Oracle database will be saved whenever you turn off the Oracle server (its Docker image); in this way, after each restart you can reuse the database.

2)    'BoGDdanC1!el' by your own password (feel free to use mine too…), recall that you need a not- too-simple password, at least 8 characters long, with at least one upper-case, at least one digit, and at least one special character.

The -v option does the directory sharing” between your machine the Docker image, and it works in both directions; your directory will be mapped to the ORCL directory on the Docker image. It is very important to use this option, so that the database is saved on your machine each time you may turn off the Oracle docker image or the Docker desktop altogether.

The -p option indicates that the container’s port 1521 (the one usually used by the Oracle server) is mapped to the same port 1521 of the Docker (the one we will use); same for 5500.

Connecting to the Oracle server is done via an Oracle client, such as SQL Developer, as instructed in the Docker setup document.

You can launch SQL Developer and connect to the Oracle server by clicking on the + icon, then filling the connection form as follows (using your chosen password):

If everything went well you can connect and run a test query for asking the time, such as

SELECT SYSDATE FROM DUAL;

Alternatively, instead of SQL Developer you can use the command line Oracle client SQL*Plus.

The database: creating the tables & indexes, populating the database

Follow in SQL Developer the following steps:

Step 1: Execute all the commands from schema_stage1.sql, as follows:  copy/paste and execute in SQL Developer each block of commands delimited by (----), one block at a time, and observe whether all is ok. You can simply select in SQL Developer the block of commands you want to execute and click on the large green rectangle button (as illustrated below); in this way, all the selected commands will be executed one after the other. Note: only the drop commands, at a first execution, should raise errors. This stage of execution should be rather fast, creating the tables.

Step 1’: Before Step 2 (inserting the data), test that the -v option workedfor you. Just create the tables (Step 1), stop and restart the Oracle docker image, and see if the tables are still there after restarting. That is, if the database has been saved and then reloaded from the directory you indicated with the -v option. Only when this is the case proceed to Step 2.

Step 2: Execute the following commands in SQL Developer, which will populate the tables we created at the previous stage. This stage is slow, may take hours (on my machine it takes roughly 12 hours), make sure you disable sleep mode on your computer and that it is not running on battery but on wall electricity; the advice is to run this stage overnight.

@/Users/Bogdan/data/OracleDBData/data1.sql;

commit;

@/Users/Bogdan/data/OracleDBData/data2.sql;

commit;

Note: Instead of /Users/Bogdan/data/OracleDBData/ you must put the path where you saved the data1.sql and data2.sql scripts.

Note: these two files are large, do not try to open them in a text editor (it will likely crash), you can take a peek and see a few lines from them with the “more” or “tail” commands instead.

Step 3: Like at stage 1, execute all the commands from schema_stage2.sql:  copy/paste and execute in SQL Developer each block of commands delimited by (----), one block at a time, and observe whether all is ok. Some of the steps here may take a little while (few minutes), be patient. You must execute all the commands, no exceptions.

The schema of the database we will use is the following:

ARTIST (IDARTIST, LASTNAME, FIRSTNAME, DOB)

COUNTRY (CODE, NAME, LANGUAGE)

MOVIE (IDMOVIE, TITLE, YEAR, IDMES, GENRE, SUMMARY, CODECOUNTRY)

•    IDMES is the identifier of the movie director

•    We will have four movie tables, called Moviei, for i=1,2,3,4

ROLE (IDMOVIE, IDARTIST, ROLENAME)

MOVIERATER (EMAIL, LASTNAME, FIRSTNAME, PROVINCE)

RATING (IDMOVIE, EMAIL, RATE)

Oracles AUTOTRACE tool

We can observe the execution plan Oracle choses for a given query by using the menu AUTOTRACE, by pressing F6. For that, you must select with the mouse the query to be executed in autotrace mode

And then hit the AUTOTRACE button below (or F6):

The plan that was actually executed (the trace thereof) will be displayed as follows:

There are many available statistics we can see along with a plan (in the trace). We will focus mainly on LAST_CR_BUFFER_GETS (number of pages from the RAM buffer) and LAST_DISK READS (number of pages read from disk).

Similarly, you can EXPLAIN the plan that Oracle intends to execute (before execution), by selecting with the mouse the query to be explained and then hitting the button below (or F10):

So the difference between AUTOTRACE and EXPLAIN is that the former runs the query and shows the executed plan and its main statistics, while the latter shows an estimated best plan and its estimated

statistics without running the query.

Note: what is displayed in the trace of the execution plan, by pressing  F6, is configurable in SQL Developer / Preferences / Database / Autotrace Explain Plan (see screenshot below).

Note: It is important to make sure that in SQL Developer -> Preferences->Database -> Autotrace Explain

Plan  the  option Fetch All Rows is checked.  This  is  an  option  that,  if  not  checked,  allows  query processing to be faster. In some cases, it is easier to fetch only a certain number of results (for instance if no order or grouping is required, etc), and only later, if necessary (if the user scrolls down on the result), other results (all of them if necessary) will be fetched; we do not want this behavior. Ö

Here are some operations and options in Oracle query execution plans (list by no means exhaustive).

OPERATORS

OPTION

MEANING

AGGREGATE

GROUP BY

Computing a one-line result by grouping and aggregating

AND-EQUAL

An operation that has as input a set of rowIds and returns their intersection (used in accesses based on an index)

COUNTING

Counting the number of lines

FILTER

Applies a filter on a set of rows.

INDEX

UNIQUE SCAN

Finding one rowId in an index.

INDEX

RANGE SCAN

Finding one or several rowIds in an index.

MERGE JOIN

Doing a merge join.

NESTED LOOPS

Doing an index nested loops join.

SORT

UNIQUE

Sorting for duplicate elimination

SORT

GROUP BY

Sorting for grouping

SORT

JOIN

Sorting for joining (merge-join)

SORT

ORDER BY

Sorting for ORDER BY

TABLE ACCESS

FULL

Getting all the rows of a table

TABLE ACCESS

CLUSTER

Getting rows by a search key in a clustered index

TABLE ACCESS

BY ROW ID

Getting rows by rowId

Note: rowId and recordId are synonyms.

Oracle implements 3 join algorithms: Nested Loops Join (when there is at least one index), Sort-Merge Join and Hash Join when there is no index.

Environment Cache and page size

We will work with a buffer cache (RAM) of limited size (on purpose, in order to make Oracle’s optimizer more creative”), for that you need to execute the following command:

alter system set db_cache_size = 400M;

Note: the cache size by default is already 400M, so no need to do this systematically.

You can first test if this is indeed the case by

show parameter db_cache_size;

The page size is 8KB. You can obtain this value (8KB is the default) by

show parameter db_block_size ;

SECTION 1

The goal of this first section is to discover and analyze the database that will be used for studying query optimization in Oracle. You will have to gather the necessary information for a good understanding of the execution plans and related statistics.

Bitmap indexes:

A few words on bitmap indexes: remember an index provides pointers to the rows in a table that contain a given key value. We saw that a B+-tree index stores a rowId or a list of rowIds for each key, corresponding to the rows (i.e., records) with that key value.

In a bitmap index, a bitmap for each key value replaces the list ofrowIds. Each bit in the bitmap corresponds to a rowId, and if the bit is set, it means that the row with the corresponding rowId contains the key value.

A mapping function is necessary to convert the bit positions to actual rowId in a heap file; in this way, that the bitmap index provides the same functionality as a regular index.

By this, bitmap indexes can be very effective for queries that contain multiple conditions in the WHERE clause, since they can be tested by logical and/or on bitmaps: in this way, rows that satisfy some, but not all, conditions are filtered out before the table itself is accessed.

Bitmap indexes are very efficient especially for low-cardinality attributes, where the ratio between the number of distinct values of the attribute and the number of records is very low (less than 1<%). Gender would be for example one such low-cardinality attribute in a table listing persons.

Question 1. Fill the following table:

Tables and indexes:

Table name

Nb of records

Nb of pages in heap file (approximative is fine)

Index (type, field/column, alternative

1/2/3, clustered / unclustered)

ARTIST

MOVIE1

MOVIE2

MOVIE3

MOVIE4

MOVIERATER

RATING

COUNTRY

ROLE