Carpe Diem

備忘録。https://github.com/jun06t

MongoDBのAggregationとSQLの比較

概要

MongoDBのAggregationとSQLとの比較です。
どうやって集計するかな?と悩んだ時に慣れているSQLとの比較があると分かりやすいと思って公式ドキュメントに沿った形で載せてます。

環境

  • MongoDB 3.0.7

オペレータの比較

SQL MongoDB
WHERE $match
GROUP BY $group
HAVING $match
SELECT $project
ORDER BY $sort
LIMIT $limit
SUM() $sum
COUNT() $sum

データ構造

以下のデータ構造で進めるとします。

{
  cust_id: "abc123",
  ord_date: ISODate("2012-11-02T17:04:11.102Z"),
  status: 'A',
  price: 50,
  items: [ { sku: "xxx", qty: 25, price: 1 },
           { sku: "yyy", qty: 25, price: 1 } ]
}

Example

COUNT()

SQL

SELECT COUNT(*) AS count
FROM orders

Mongo

db.orders.aggregate( [
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )

SUM()

SQL

SELECT SUM(price) AS total
FROM orders

Mongo

db.orders.aggregate( [
   {
     $group: {
        _id: null,
        total: { $sum: "$price" }
     }
   }
] )

GROUP BY

SQL

SELECT cust_id, SUM(price) AS total
FROM orders
GROUP BY cust_id

Mongo
_idに対象のフィールドを指定します。

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )

複数のGROUP BY

SQL

SELECT cust_id,
       ord_date,
       SUM(price) AS total
FROM orders
GROUP BY cust_id,
         ord_date

Mongo
_idにネストする形で指定します。フィールド名($がついていない方)が出力時のフィールド名になります。

db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        },
        total: { $sum: "$price" }
     }
   }
] )

ORDER BY

SQL

SELECT cust_id,
       SUM(price) AS total
FROM orders
GROUP BY cust_id
ORDER BY total

Mongo

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $sort: { total: 1 } }
] )

HAVING

SQL

SELECT cust_id,
       count(*)
FROM orders
GROUP BY cust_id
HAVING count(*) > 1

Mongo

db.orders.aggregate( [
   {
     $group: {
        _id: "$cust_id",
        count: { $sum: 1 }
     }
   },
   { $match: { count: { $gt: 1 } } }
] )

WHERE

SQL

SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id

Mongo

db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   }
] )

WHERE と HAVING

SQL

SELECT cust_id,
       SUM(price) as total
FROM orders
WHERE status = 'A'
GROUP BY cust_id
HAVING total > 250

Mongo

db.orders.aggregate( [
   { $match: { status: 'A' } },
   {
     $group: {
        _id: "$cust_id",
        total: { $sum: "$price" }
     }
   },
   { $match: { total: { $gt: 250 } } }
] )

サブクエリ

SQL

SELECT COUNT(*)
FROM (SELECT cust_id,
             ord_date
      FROM orders
      GROUP BY cust_id,
               ord_date)
      as DerivedTable

Mongo

db.orders.aggregate( [
   {
     $group: {
        _id: {
           cust_id: "$cust_id",
           ord_date: {
               month: { $month: "$ord_date" },
               day: { $dayOfMonth: "$ord_date" },
               year: { $year: "$ord_date"}
           }
        }
     }
   },
   {
     $group: {
        _id: null,
        count: { $sum: 1 }
     }
   }
] )

ソース