Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 1
La Trobe University
Department of Computer Science and Computer Engineering
CSE3DMS Assignment 2020
Objectives
• To understand physical database design as a critical element in achieving overall
database objectives.
• To have a sound understanding of the use of indexes and the trade-offs that must be
considered in their use.
• To understand the concepts of database query optimization.
• To tune database queries using Oracle facilities such as hint and autotrace for better
performance
***
Weight of the assessment
1. CSE3DMS 30 % (Phase 1 + phase 2)
1 The Instructions
1.1 Due Dates:
There are two phases of this assignment.
1. The first phase is due on 10:00am, Monday, 04 May 2020.
2. The second phase will be due on 10:00am, Monday, 01 June 2020.
Delays caused by computer downtime cannot be accepted as a valid reason for late
submission without penalty. Students must plan their work to allow for both scheduled
and unscheduled downtime. No exntenions will be given.
1.2 Can be a group OR a single person assignment
This can be a group OR a single person assignment. A group must have no more than
two memebers. If you don’t have a group and you want to join a group, please send an
Email to Noorul Amin at n.amin@latrobe.edu.au to arrange a group for you.
☛ Note: This is your own responsibility to contact your group from week 5 and work
together. You will be excluded from group if you failed to contact group or don’t
respond to emails/calls from your group. You must do your own work individually
once your group has subdivided the workload. No extension will be given for extra
group-collaboration or any other reasons.
☛ Note: You may indicate the tasks allocated or done by each member in the
submission cover page, if it is required.
1.3 What & how to submit
2. Submit a group cover page. Please ensure that all names of your group are in
the first cover page. Also include a list of submitted files in the cover page.
3. All files should be zipped into one file named with “YourName_DMS.zip”. Please
note, only ONE COPY of your submission is required. Other member of one group
should be on the cover page already. No duplicated submissions!
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 2
4. You must submit an electronic copy of your assignment via LMS under subject
CSE3DMS. Find assignment submission section in CSE3DMS page
(https://lms.latrobe.edu.au/course/view.php?id=77413).
☛ Note: It is student’s responsibility to ensure that they understand the submission
instructions. If you have ANY difficulties ask your Lecturer for assistance (prior to
the submission date).
☛ Note: Your personal computer problems cannot be used for the reason on delay
of your assignment.
☛ Note: You may be asked to attend a verbal assessment for this assignment that
would be used as a weighting factor on the final assignment mark.
1.4 Plagiarism
Plagiarism is the submission of somebody else’s work in a manner that gives the
impression that the work is your own.
Penalties for Academic Misconduct
There are serious penalties for academic misconduct. You may be asked to leave the
University and will not be able to complete your course if a serious academic misconduct
is involved in the case. Some students have even been barred from registration in their
chosen profession! http://www.latrobe.edu.au/students/academic-integrity
Penalties can be a mark of ZERO for the assignment.
1.5 Queries Regarding Assignment
Consultation time for part 1: 1 hour, 1 week before the assignment due date.
Consultation time for part 2: 2 hours, 1 week before the assignment due date.
Any queries can be addressed to the Lecturer during the normal consultation time
1.6 Marking Scheme Overview (including, but not limited to, the following)
The assignment will be assessed as two parts:
1. Development of Oracle Database and queries (50%)
• Implementation of the SQL scripts (15%) – part one
Do the database tables have appropriate data & data type?
• Queries Design and Structure (35%)
Do the SQL queries conform to the specifications?
Do the SQL queries answer the questions correctly and in a well-designed manner?
Do the queries get an optimized execution plan?
2. Database tuning analysis (50%)
• The executions of spool files (10%)
Provide the spool files including all queries and the execution plans to
demonstrate your executions of queries.
5. Documentation of Analysis & comparison (40%)
You must provide your analysis on the impact of query performance by using
different indexes used, the disk I/O access, and the response time etc.
Specify what types of indexes created on what attributes in the table are efficient
for what kinds of queries.
State your conclusion based on your execution of the queries.
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 3
1.7 Return of Assignments
Marks will be advertised on the subject website two weeks after the submission.
***
2 The Database Description
A large supermarket Good Food has over 2000 stores across the country. Each store is
made up of several departments: green grocery, frozen foods, general grocery, dairy, meat,
delicatessen, bakery and health/beauty. Each store has approximately 65,000 products
on its shelves. Individual products are known as stock keeping units (SKUs). About
60,000 of the SKUs come from outside manufacturers and have bar codes imprinted on
their packaging. The remaining 5,000 SKUs come from departments such as meat,
delicatessen or the bakery. The product are packed in various ways, such as paper carton,
plastic bag, paper bag, bottle, can and box etc. Product size is: LxWxH.
The Good Food supermarket has got a distributed database systems to manage the
operational business within the company. The database records all products, inventory,
stores, suppliers, employees and sales information. To help the supermarket organization
drives sales, marketing, loyalty, and service effectiveness, they also record all customers
to offer a VIP program. To ensure high customer numbers, management needs to carefully
consider its pricing and VIP discounts. The VIP can be classified by 5 levels:
1. Blue member, 0% discount
2. Silver member, 5% discount
3. Gold member, 8%
4. Diamond member, 10%
5. Platinum member, 12%
Employee’s level is between 1 to 10.
☛ Note: The data type of primary keys should be the integer type. The name of the
attributes in each table should be the same as in the diagram.
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 4
The following tables are the partial Supermarket Database schema:
Figure 1. E-R Diagram of the Good Food database schema
***
3 The Assignment Description
3.1 Phase 1. Due on 04 May [15 Marks]
You are requried to implement the Good Food supermarket Database schema (Shown
in Figure 1). You are required to use SQL*Loader to input data into large size of
database tables and use SQL commands to insert data into small size of tables.
Specifically, you need to
1. Implement and populate at least 30 records into each of the 5 large tables of
Customer, Product, Store, Employee and Sale transaction individually using
SQL*Loader.
2. Implement 5 smaller tables: Brand, Category, Department, Supplier, VIP
using SQL DML insertion commnads. Insert 10 records into Brand table,
Department table, Category and Supplier tables, 5 records into VIP.
You will Submit the following files in phase 1 by 04 May, 2020:
1. The database SQL scripts used on SQLPlus for creating Good Food supermarket
Database (all 10 tables) based on Figure 1; and insertion scripts for tables of
Brand, Category, Department, Supplier and VIP.
| Employee |
| E_id (PK) num E_Lname E_Fname E_Level num E_Sex E_DOB date E_ phone E_Street E_Suburb E_City E_State E_postcode num E_Date_started |
| Brand |
| Brand_id (PK) num Category_ID(fk) Brand_name Manufacture Supplier_ID(FK) |
| Department |
| Depart_ID (PK) Depart_name Depart_desc |
| Category |
| Category_ID (PK) Category _name Depart_ID(FK) |
| Product |
| Product _ID(PK) num Prod_name Prodt_Desc Brand_ID (FK) PackageType _desc Prod_unitprice PackageSize (LxHxW) Weight |
| Store |
| Store_id num (pk) Stoer_name Strore_ Street Store_Suburb Store_city Store_State Store_Postcode Store_ Phone Store_ Fax |
| Sale Transaction |
| Store_id (fk) Product_id fk) E_ID (fk) C_VIP_id (fk) Quantity Sale_date |
| Supplier |
| Supplier_ID (PK) Supp _name Supp_street Supp_suburb Supp_city Supp_state Supp_postcode Supp_tele Supp_fax |
| Customer |
| C _VIP_id (PK) C_Lname C_Fname VIP_Level(FK) C_Sex C_DOB C_Phone C_Street C_Suburb C_City C_ State C_ postcode C_Date_joined |
| VIP |
| VIP_level (PK) VIP_desc VIP_discount |
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 5
2. Five Control files will be used for SQL*Loader.
3. Five flat data files: Customer.dat, Product.dat, Store.dat, Employee.dat and
Sale_transaction.dat. (Hint: create your data files by Excel and save as *.csv. Then
change the files’ extension .csv to .dat)
4. The command scripts used for SQL*loader (all commands can be in one file).
***
3.2 Phase 2: Tuning Database performance [85 marks] due on 01 June, 2020.
Step 1. SQL queries
You answer following queries that will be used for comparisons and analysis the query
performance in next question for better executions of the queries.
1) Find the total sales between 01/03/2015 to 31/03/2017 for the most expensive
product of fragrance (category name) at all stores in city Melbourne. Write three
different queries to find the best performance query. One query contains one
subquery in FROM clause, one may contain subqueries in WHERE clause and the
other one may use WITH clause to create a tempory table in the query.
Hint: First part of the query is to get the total sales of Product in Category Fragrance; then write
a subquery to get a list of the product_ids whose unit prices are the same as the max
Product_UnitPrice at the store Bundoora.
Hint: Use date function to tell your date format TO_DATE(’31/03/2017′, ‘DD/MM/yyyy’). Using
function upper(..) for string comparison.
Output:
2) Find Melbourne VIP level 1, female and Melbourne customer names who are have
bought the “Ice Cream” at least 2 times in database. You write three different queries:
one is using operator EXISTS and the other one is using operator IN. One of the
queries with the main filter criteria in FROM clause of the main query. (Hint: using
Group by c _VIP_id having count(*)>1) Find one with the better performance.
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 6
Output:
3) Find the list of employee names who sell products in these stores which their total
sales are the lowest total sales in the database. Write two different queries: one query
contains one or more sub-queries in FROM or WHERE clause and the other one
contains WITH clause (hint: get a two colomn table (store_id, total_sales) sales), get
a min total sale, and get all stoer_ids for the min total sales.)
An example for a nested subquery in FROM clause:
SELECT MIN(ST.sum_column1)
FROM (SELECT store_ID, SUM(sale_price) AS sum_column1
FROM Sale_transct GROUP BY store_ID) ST
WHERE …;
Outptut:
4) Retrieve the names of Employees who have the same E_level values as Jorge Peres
and also are living at the same suburb as Jorge Peres. Write two different queries
with nested queries. (hint: one query uses WITH Clause to define Jorge’s level and
the suburb; while the second query is required to get the level and the subub of
“Jorge Peres-level table” in your WHERE clause Or FROM clause)
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 7
Eg.,
SELECT s_fname, s_lname
| FROM | student s, enrollment e, |
| (SELECT | s_level |
| FROM student s WHERE UPPER(s.s_lname) = ‘PERES’ |
|
| AND ….. |
UPPER(s.s_fname) = ‘JORGE’) slevel, /slevel is a table/ |
| WHERE |
SELECT e.e_fname, e.e_lname
FROM Employee e
WHERE e.e_level = (
SELECT e_level
FROM Employee
WHERE UPPER(e_lname) = ‘PERES’
AND UPPER(e_fname) = ‘JORGE’)
Answer1&2: 40&41 Rows in the result
5) Display the names of products that bought by Female VIPs who are in level 3, level
4, and level 5 (not level 1 or 2) and a list of all products supplied by supplier’s
postcode 3083. Write two queries using UNION & UNION ALL (using distinct).
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 8
Step 2. Tuning queires and analyze different queries – using execution plan with
indexes
In this questions, you need to study and observe the performance impact of the
queries from Step 1. To check if you have used unnecessary distinct, unnecessary
nested sub-queries and unnecessary join or set operators in your queries.
You can use the following different techniques to tune your queries. Compare and
analyze the query results by looking into the execution plan. You understand the
optimizer’s decisions and analyse the EXPLAIN PLAN results by looking into the
execution plans of one query question
• Get the exaction plan of each query to compare the response time, the disk access,
the indexes usage and the join methods etc. of different queries you have written.
o The ordering of the tables referenced by the statement
o The access method for each table mentioned in the statement
o The join method for tables affected by join operations in the statement
o Data operations like filter, conditions
• Check if you should define appropriate indexes (bitmap, secondary indexes or a
function based index etc).
• When you are tuning queries, you may use Hints. The hints may be used after
you have collected statistics on the relevant tables and evaluated the optimizer
plan. Monitoring query performance enhancements to see whether any significant
impact on queries by hints.
• You need to spool each query execution for the EXPLAIN PLAN of both original
queries and optimized/modified queries (if there are any) to learn how the
optimizer is executing a query.
☛ Notice: EXPLAIN PLAN is to get a plan before the query execution. AUTOTRACE is
for actual execution of queries.
In phase two, you should have the following files in your submission:
• SQL query scripts file – can be in one YourName-SQL.sql file
Database Management Systems CSE3DMS
Due 10:00 am Monday 04 May & 01 june 2020 Page 9
• Spool files for executions of queries with their execution plans from SQL*Plus – you
can concatenate all spool files into one file for submission.
• The table of your comparisons on database query execution plans and your analysis
with your conclusions.
You may provide a table to tell the execution plans of each query:
| Query No. |
Tables accesse d |
Filter condition |
No. Full table access $ No. of query blocks |
Join Method |
Index Type & attributes |
Join order |
other (result size, Plan/afte r executio n |
Conclusio ns |
| … | … |
***
The post Database Management Systems CSE3DMS appeared first on My Assignment Online.