SQL Aggregate Queries


CS430/630, Lecture 8. Slides based on Database Management Systems, 3rd ed., Ramakrishnan and Gehrke.

Aggregate Operators

Aggregate operators are a significant extension of relational algebra:

    COUNT (*)
    COUNT ([DISTINCT] A)
    SUM ([DISTINCT] A)
    AVG ([DISTINCT] A)
    MAX (A)
    MIN (A)

A is a single column. The result is a single value, obtained by applying the aggregate over all qualifying tuples.

    SELECT COUNT (*)
    FROM Sailors S
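As a rough illustration, the operators above can be tried out with Python's built-in sqlite3 module. This is only a sketch: it loads the seven Sailors rows whose rating and age are legible in the instance shown later in these slides (an assumption, since the remaining rows are incomplete in this transcription), and the variable names are illustrative.

```python
import sqlite3

# Assumption: only the seven Sailors rows with a legible rating and age
# are loaded; the incomplete rows of the transcription are left out.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# Each aggregate collapses all qualifying tuples into a single value,
# so the whole query returns exactly one row.
row = conn.execute(
    """
    SELECT COUNT(*), COUNT(DISTINCT S.rating), SUM(S.age),
           AVG(S.age), MAX(S.age), MIN(S.age)
    FROM Sailors S
    """
).fetchone()
count_all, distinct_ratings, total_age, avg_age, max_age, min_age = row
print(row)
```

COUNT (*) counts tuples, while COUNT (DISTINCT S.rating) counts distinct values, so the two differ here because two sailors share rating 10.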

Aggregate Queries: Examples

    SELECT AVG (S.age)
    FROM Sailors S
    WHERE S.rating = 10

    SELECT COUNT (DISTINCT S.rating)
    FROM Sailors S
    WHERE S.sname = 'Bob'

Aggregate + nested:

    SELECT S.sname
    FROM Sailors S
    WHERE S.rating = (SELECT MAX (S2.rating)
                      FROM Sailors S2)
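The three example queries can be run against a small in-memory table to see what each returns. The sketch below assumes the seven legible Sailors rows from the instance shown later in these slides, and it matches the name as lowercase 'bob' because that is how the name is stored in that instance.

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# Average age of the sailors with rating 10.
avg_age_rating_10 = conn.execute(
    "SELECT AVG(S.age) FROM Sailors S WHERE S.rating = 10"
).fetchone()[0]

# Number of distinct ratings held by sailors named bob.
ratings_named_bob = conn.execute(
    "SELECT COUNT(DISTINCT S.rating) FROM Sailors S WHERE S.sname = 'bob'"
).fetchone()[0]

# Aggregate + nested: names of the sailors holding the highest rating.
top_rated = [
    name
    for (name,) in conn.execute(
        """
        SELECT S.sname
        FROM Sailors S
        WHERE S.rating = (SELECT MAX(S2.rating) FROM Sailors S2)
        ORDER BY S.sname
        """
    )
]
print(avg_age_rating_10, ratings_named_bob, top_rated)
```

The nested form returns two names here, because two sailors share the maximum rating.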

Common Mistake with Aggregates

    SELECT S.sname, MAX (S.age)
    FROM Sailors S

Illegal query! The SELECT clause can't contain both aggregates and non-aggregates. Exception: GROUP BY (later in this class).

Reason: it is not guaranteed that there is only one tuple with the MAX value.
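One legal way to get the name together with the maximum age is to move the aggregate into a subquery. The following is a minimal sketch, assuming the seven legible Sailors rows from the instance shown later in these slides; the extra sailor with sid 99 is hypothetical and is added only to show why a single name cannot be guaranteed.

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# The aggregate lives in the subquery; the outer SELECT has no aggregate,
# so it may freely list ordinary columns.
OLDEST_QUERY = """
    SELECT S.sname, S.age
    FROM Sailors S
    WHERE S.age = (SELECT MAX(S2.age) FROM Sailors S2)
    ORDER BY S.sname
"""
oldest = conn.execute(OLDEST_QUERY).fetchall()

# Hypothetical sailor (not in the source) who ties for the maximum age.
conn.execute("INSERT INTO Sailors VALUES (99, 'tied', 5, 63.5)")
oldest_with_tie = conn.execute(OLDEST_QUERY).fetchall()

print(oldest)
print(oldest_with_tie)
```

With the tie in place the query returns two rows, which is the situation the illegal form cannot express. SQLite itself accepts the mixed form as a non-standard extension, so this sketch does not try to trigger an error.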

Grouping Results

So far, aggregates have been applied to all (qualifying) tuples. We may want to apply them to each of several groups.

Example: find the age of the youngest sailor for each rating level.

In general, we don't know how many rating levels exist, or what the rating values for these levels are. Suppose we know that rating values go from 1 to 10. We could then write one query per rating value:

    SELECT MIN (S.age) FROM Sailors S WHERE S.rating = 1
    SELECT MIN (S.age) FROM Sailors S WHERE S.rating = 2
    ...
    SELECT MIN (S.age) FROM Sailors S WHERE S.rating = 10

How can we achieve this in general?
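The one-query-per-rating approach can be sketched as a loop and compared with a single GROUP BY query, which is the construct this lecture introduces for the purpose. The example assumes the seven legible Sailors rows from the instance shown later in these slides.

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# One query per rating value; this only works because we assumed
# in advance that ratings range from 1 to 10.
per_rating = {}
for rating in range(1, 11):
    (min_age,) = conn.execute(
        "SELECT MIN(S.age) FROM Sailors S WHERE S.rating = ?", (rating,)
    ).fetchone()
    per_rating[rating] = min_age  # None when no sailor has this rating

# A single GROUP BY query finds the rating levels by itself.
grouped = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    GROUP BY S.rating
    ORDER BY S.rating
    """
).fetchall()

print(per_rating)
print(grouped)
```

The loop yields an empty answer (None) for ratings that no sailor holds, whereas the grouped query only produces rows for ratings that actually occur.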

Queries With GROUP BY and HAVING

    SELECT   [DISTINCT] target-list
    FROM     relation-list
    WHERE    qualification
    GROUP BY grouping-list
    HAVING   group-qualification

The target-list contains:

    (i) a list of attribute names
    (ii) terms with aggregate operations (e.g., MIN (S.age))

The attribute list (i) must be a subset of grouping-list. A group is a set of tuples that have the same value for all attributes in grouping-list. Each answer tuple corresponds to a group, so these attributes must have a single value per group.

Conceptual Evaluation

1. Compute the cross-product of relation-list.
2. Discard tuples that fail qualification; delete unnecessary fields.
3. Partition the remaining tuples into groups by the value of the attributes in grouping-list.
4. Discard groups that fail group-qualification.
   Expressions in group-qualification must have a single value per group! An attribute in group-qualification that is not an argument of an aggregate operation must appear in grouping-list (unless EVERY or ANY is used).
5. Generate a single answer tuple per qualifying group.
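The five steps above can be mimicked in plain Python and checked against what a SQL engine returns. This is a sketch of the conceptual strategy only, not of how a real engine executes the query. It assumes the seven legible Sailors rows from the instance shown later in these slides, and the function names and the two parameters (minimum age, minimum group size) are illustrative.

```python
import sqlite3
from collections import defaultdict

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]


def conceptual_eval(rows, min_age, min_count):
    """Follow the five conceptual steps for a MIN(age)-per-rating query."""
    # Step 1: cross-product of relation-list (a single relation here).
    product = list(rows)
    # Step 2: apply the qualification and drop the unnecessary fields.
    kept = [(rating, age) for (_sid, _name, rating, age) in product
            if age >= min_age]
    # Step 3: partition by the grouping-list value (rating).
    groups = defaultdict(list)
    for rating, age in kept:
        groups[rating].append(age)
    # Step 4: discard groups that fail the group-qualification.
    qualifying = {r: ages for r, ages in groups.items()
                  if len(ages) >= min_count}
    # Step 5: one answer tuple per qualifying group.
    return sorted((r, min(ages)) for r, ages in qualifying.items())


def sql_eval(rows, min_age, min_count):
    """Run the equivalent SQL query in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
    )
    conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", rows)
    return conn.execute(
        """
        SELECT S.rating, MIN(S.age)
        FROM Sailors S
        WHERE S.age >= ?
        GROUP BY S.rating
        HAVING COUNT(*) >= ?
        ORDER BY S.rating
        """,
        (min_age, min_count),
    ).fetchall()


print(conceptual_eval(ROWS, 0, 2), sql_eval(ROWS, 0, 2))
```

Because the group-qualification is applied in step 4, it only ever sees tuples that survived the qualification in step 2.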

GROUP BY Query Example

Find the age of the youngest sailor with age at least 18, for each rating with at least 2 such sailors.

    SELECT S.rating, MIN (S.age) AS minage
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT (*) > 1

Sailors

    sid  sname    rating  age
    22   dustin   7       45.0
    29   brutus   1       33.0
    31   lubber   8       55.5
    32   andy
    58   rusty    10      35.0
    64   horatio
    71   zorba    10      16.0
    74   horatio  9       35.0
    85   art
    95   bob      3       63.5
    96   frodo

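GROUP BY Conceptual Evaluation Example

Find the age of the youngest sailor with age at least 18, for each rating with at least 2 such sailors.

Sailors, restricted to the rating and age fields:

    rating  age
    7       45.0
    1       33.0
    8       55.5
    10      35.0
    10      16.0
    9       35.0
    3       63.5

After discarding tuples that fail S.age >= 18 and partitioning by rating:

    rating  age
    1       33.0
    3       63.5
    7       45.0
    8       55.5
    9       35.0
    10      35.0

Answer:

    rating  minage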

More Group Qualification Functions

So far, we have seen group qualification based on a property of the group as a whole, e.g., an aggregate function computed for the entire group. Recent versions of the SQL standard also allow group qualification based on a property of individual records:

    EVERY (condition): TRUE if condition holds for every tuple in the group
    ANY (condition): TRUE if condition holds for some tuple in the group

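Find the age of the youngest sailor with age at least 18, for each rating with at least 2 such sailors and with every sailor under 60.

    HAVING COUNT (*) > 1 AND EVERY (S.age <= 60)

Sailors, restricted to the rating and age fields:

    rating  age
    7       45.0
    1       33.0
    8       55.5
    10      35.0
    10      16.0
    9       35.0
    3       63.5

After discarding tuples that fail the age condition and partitioning by rating:

    rating  age
    1       33.0
    3       63.5
    7       45.0
    8       55.5
    9       35.0
    10      35.0

Answer:

    rating  minage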
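SQLite, which is used for this sketch, has no EVERY or ANY aggregate, so the example below emulates them: a comparison evaluates to 1 or 0 in SQLite, hence MIN over the comparison equals 1 exactly when it holds for every tuple of the group, and MAX equals 1 when it holds for at least one. This emulation is a substitution made for the example, not the syntax from the slides, and it assumes the compared column contains no NULLs. The data is the seven legible Sailors rows, and the helper name run is illustrative.

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)


def run(having):
    """Run the youngest-adult-per-rating query with the given HAVING text."""
    sql = f"""
        SELECT S.rating, MIN(S.age) AS minage
        FROM Sailors S
        WHERE S.age >= 18
        GROUP BY S.rating
        HAVING {having}
        ORDER BY S.rating
    """
    return conn.execute(sql).fetchall()


# Emulates EVERY (S.age <= 60): all tuples in the group satisfy it.
every_at_most_60 = run("MIN(S.age <= 60) = 1")
# Emulates ANY (S.age > 60): at least one tuple in the group satisfies it.
any_over_60 = run("MAX(S.age > 60) = 1")
# The HAVING clause from the slide, with EVERY emulated.
slide_having = run("COUNT(*) > 1 AND MIN(S.age <= 60) = 1")

print(every_at_most_60)
print(any_over_60)
print(slide_having)
```

On these seven rows the slide's condition leaves no group, since no rating has two sailors aged 18 or over; the two single-condition variants show what each emulated operator filters on its own.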

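Pay attention to the order of steps! HAVING executes AFTER WHERE.

Find the age of the youngest sailor with age >= 18, for each rating with at least 2 sailors (of any age).

    SELECT S.rating, MIN (S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT (*) > 1

WRONG! Here COUNT (*) only counts the sailors that passed the WHERE clause, i.e., those aged at least 18, not all sailors with that rating.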

Find the age of the youngest sailor with age >= 18, for each rating with at least 2 sailors (of any age).

Sailors, restricted to the rating and age fields:

    rating  age
    7       45.0
    1       33.0
    8       55.5
    10      35.0
    10      16.0
    9       35.0
    3       63.5

After discarding tuples that fail S.age >= 18 and partitioning by rating:

    rating  age
    1       33.0
    3       63.5
    7       45.0
    8       55.5
    9       35.0
    10      35.0

Answer:

    rating  minage
    10      35.0

Pay attention to the order of steps!

Find the age of the youngest sailor with age >= 18, for each rating with at least 2 sailors (of any age).

    SELECT S.rating, MIN (S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT (*)
                FROM Sailors S2
                WHERE S.rating = S2.rating)

HAVING executes AFTER WHERE. The HAVING clause can also contain a subquery!
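The difference between counting inside the group and counting with a subquery can be checked directly. The sketch below runs both formulations on the seven legible Sailors rows (an assumption, as the other rows of the instance are incomplete in this transcription).

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# COUNT(*) sees only the tuples left after WHERE, so the 16-year-old
# with rating 10 is no longer counted.
count_after_where = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING COUNT(*) > 1
    """
).fetchall()

# The subquery goes back to the full Sailors table, so it counts
# sailors of any age that share the group's rating.
count_via_subquery = conn.execute(
    """
    SELECT S.rating, MIN(S.age)
    FROM Sailors S
    WHERE S.age >= 18
    GROUP BY S.rating
    HAVING 1 < (SELECT COUNT(*)
                FROM Sailors S2
                WHERE S.rating = S2.rating)
    """
).fetchall()

print(count_after_where)
print(count_via_subquery)
```

Rating 10 has two sailors overall but only one aged 18 or over, which is why only the subquery version reports it.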

Summary of cases (INFORMAL!)

Can the group validation condition be evaluated on the intermediate relation alone?

If NO, then we need a subquery in HAVING.

If YES, then we do not need a subquery, and we have two further cases:

    Group validation DOES NOT depend on individual tuples in the group: only aggregates and group-by attributes appear in the HAVING clause.
    Group validation DOES depend on individual tuples in the group: non-group-by attributes may appear with the ANY or EVERY operator.

Note: this is just a guideline for most cases; it is actually possible to have a mix of the above!

Aggregates and FROM Subqueries

Aggregate operations cannot be nested!

Find the rating that has the lowest average sailor age.

WRONG:

    SELECT S.rating
    FROM Sailors S
    WHERE S.age = (SELECT MIN (AVG (S2.age))
                   FROM Sailors S2)

Correct solution:

    SELECT Temp.rating, Temp.avgage
    FROM (SELECT S.rating, AVG (S.age) AS avgage
          FROM Sailors S
          GROUP BY S.rating) Temp
    WHERE Temp.avgage = (SELECT MIN (Temp.avgage)
                         FROM Temp)
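A runnable variant of the correct solution is sketched below. It is a reformulation rather than the slide's exact text: the per-rating averages are defined once in a WITH clause, because SQLite does not let the inner subquery refer to a derived table's alias (Temp) as if it were a table. The data is again the seven legible Sailors rows, which is an assumption.

```python
import sqlite3

# Assumption: the seven Sailors rows with a legible rating and age.
ROWS = [
    (22, "dustin", 7, 45.0),
    (29, "brutus", 1, 33.0),
    (31, "lubber", 8, 55.5),
    (58, "rusty", 10, 35.0),
    (71, "zorba", 10, 16.0),
    (74, "horatio", 9, 35.0),
    (95, "bob", 3, 63.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)"
)
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)", ROWS)

# First aggregate: average age per rating (Temp).
# Second aggregate: the minimum of those averages, taken over Temp.
lowest_avg = conn.execute(
    """
    WITH Temp AS (
        SELECT S.rating, AVG(S.age) AS avgage
        FROM Sailors S
        GROUP BY S.rating
    )
    SELECT Temp.rating, Temp.avgage
    FROM Temp
    WHERE Temp.avgage = (SELECT MIN(Temp.avgage) FROM Temp)
    """
).fetchall()

print(lowest_avg)
```

The two aggregation levels stay separate: AVG is computed per group inside Temp, and MIN is then applied to the already aggregated column.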