Hobby

MySQL Ambiguity in Selecting Row Holding the Maximum of a Certain Column


Spring Datafication 2022. 7. 28. 11:29

Suppose we create a table tbl_fas

CREATE TABLE tbl_fas (
                      article INT UNSIGNED  DEFAULT '0000' NOT NULL,
                      dealer  CHAR(20)      DEFAULT ''     NOT NULL,
                      price   DECIMAL(16,2) DEFAULT '0.00' NOT NULL,
                      PRIMARY KEY(article, dealer));

and we insert some data

INSERT INTO tbl_fas VALUES
                     (1,'A',3.45),(1,'B',3.99),(2,'A',10.99),(3,'B',1.45),
                     (3,'C',1.69),(3,'D',1.25),(4,'D',19.95);

Suppose our goal is to find the most expensive article fas reads.
We can do this three simple ways

1 LEFT JOIN

SELECT f.article, f.dealer, f.price
FROM tbl_fas f
LEFT JOIN tbl_fas f1 ON f.price < f1.price
WHERE f1.article IS NULL;

A query analysis gives us.
Which is not bad for small number of items to compare

[
  {
    "id": 1,
    "select_type": "SIMPLE",
    "table": "f",
    "type": "ALL",
    "possible_keys": null,
    "key": null,
    "key_len": null,
    "ref": null,
    "rows": "7",
    "r_rows": "7.00",
    "filtered": 100,
    "r_filtered": 100,
    "Extra": ""
  },
  {
    "id": 1,
    "select_type": "SIMPLE",
    "table": "f1",
    "type": "ALL",
    "possible_keys": null,
    "key": null,
    "key_len": null,
    "ref": null,
    "rows": "7",
    "r_rows": "7.00",
    "filtered": 100,
    "r_filtered": 14.29,
    "Extra": "Using where; Not exists; Using join buffer (flat, BNL join)"
  }
]
article dealer price
4 D 19.95

2 Subquery

SELECT article, dealer, price
FROM   tbl_fas
WHERE  price=(SELECT MAX(price) FROM tbl_fas);

again the query analysis here is the subquery that gives the difference.

[
  {
    "id": 1,
    "select_type": "PRIMARY",
    "table": "tbl_fas",
    "type": "ALL",
    "possible_keys": null,
    "key": null,
    "key_len": null,
    "ref": null,
    "rows": "7",
    "r_rows": "7.00",
    "filtered": 100,
    "r_filtered": 14.29,
    "Extra": "Using where"
  },
  {
    "id": 2,
    "select_type": "SUBQUERY",
    "table": "tbl_fas",
    "type": "ALL",
    "possible_keys": null,
    "key": null,
    "key_len": null,
    "ref": null,
    "rows": "7",
    "r_rows": "7.00",
    "filtered": 100,
    "r_filtered": 100,
    "Extra": ""
  }
]
article dealer price
4 D 19.95

3 LIMIT clause

Lastly we can do this with just the limit clause.

SELECT article, dealer, price
FROM tbl_fas
ORDER BY price DESC
LIMIT 1;

The json view of the query analysis shows simply one query

[
  {
    "id": 1,
    "select_type": "SIMPLE",
    "table": "tbl_fas",
    "type": "ALL",
    "possible_keys": null,
    "key": null,
    "key_len": null,
    "ref": null,
    "rows": "7",
    "r_rows": "1.00",
    "filtered": 100,
    "r_filtered": 100,
    "Extra": "Using filesort"
  }
]
article dealer price
4 D 19.95

In most cases, It is easy to use the limit clause query.

But what happens when the maximum value of price(19.95) is more than 1 ?!.
Lets try to make changes to the table and try again.

UPDATE test.tbl_fas t SET t.price = 19.95 WHERE t.article = 2 AND t.dealer LIKE 'A' ESCAPE '#'

The results of the queries below gives two rows.

SELECT * FROM tbl_fas WHERE  price=(SELECT MAX(price) FROM tbl_fas);
---
SELECT s1.article, s1.dealer, s1.price
FROM tbl_fas s1 LEFT JOIN tbl_fas s2 ON s1.price < s2.price
WHERE s2.article IS NULL;
article dealer price
2 A 19.95
4 D 19.95

Clearly we have two maximum values.

The Limit query

SELECT article, dealer, price
FROM tbl_fas
ORDER BY price DESC
LIMIT 1;

returns only

article dealer price
2 A 19.95

based on the order condition with limit clause the results is ONLY one. Which is definitely ambiguous if
this is not the row of data we expected.

When not to use LIMIT Clause

  1. SELECT statement (where max != 1).
  2. When distinct values of max,min is not CERTAIN
반응형