# ElasticSearch Bucket Aggregation



# Practice Environment

  • 💡 Elasticsearch 7.9.0
  • 💡 Windows 10
  • 💡 Git Bash

# What Is Aggregation?

Simply put, aggregation is a way to derive values by combining the Documents stored in ElasticSearch.

Among the aggregation types, Bucket Aggregation can be understood as a group by.

It is useful when you want to split documents into groups by a particular field.

Let's delete the basketball index created in the previous exercise, create it again,

and start over from the Mapping.

# Delete the index

curl -XDELETE http://localhost:9200/basketball

# Create the index

curl -XPUT localhost:9200/basketball


# Type Mapping

Now let's define the Mapping for the new index.

Contents of basketball_mapping.json

{
	"record" : {
		"properties" : {
			"team" : {
				"type" : "text",
				"fielddata" : true
			},
			"name" : {
				"type" : "text",
				"fielddata" : true
			},
			"points" : {
				"type" : "long"
			},
			"rebounds" : {
				"type" : "long"
			},
			"assists" : {
				"type" : "long"
			},
			"blocks" : {
				"type" : "long"
			},
			"submit_date" : {
				"type" : "date",
				"format" : "yyyy-MM-dd"
			}
		}
	}
}

The type named record contains several properties.

text fields cannot be used in aggregations by default, so fielddata : true is set on team and name to make them available to aggregations.

Let's apply the Mapping to ES.

$ curl -XPUT 'http://localhost:9200/basketball/record/_mapping?include_type_name=true&pretty' -d @basketball_mapping.json -H 'Content-Type: application/json'
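As an alternative to fielddata : true, fields that are only grouped on (never full-text searched) are commonly mapped as keyword. The following is a minimal sketch under that assumption, not the method used in this post: the file name basketball_mapping_keyword.json is hypothetical, and the mapping is written in the typeless form, so it would be applied without include_type_name. Note that with keyword fields the bucket keys keep their original case (Chicago, LA).

```shell
# Sketch: write an alternative, typeless mapping that uses keyword fields.
# basketball_mapping_keyword.json is a hypothetical file name, not from the original post.
mapping_file="basketball_mapping_keyword.json"
cat > "$mapping_file" <<'EOF'
{
	"properties" : {
		"team" : { "type" : "keyword" },
		"name" : { "type" : "keyword" },
		"points" : { "type" : "long" },
		"rebounds" : { "type" : "long" },
		"assists" : { "type" : "long" },
		"blocks" : { "type" : "long" },
		"submit_date" : { "type" : "date", "format" : "yyyy-MM-dd" }
	}
}
EOF

# Count the lines that declare a keyword field (team and name).
keyword_field_count="$(grep -c '"keyword"' "$mapping_file")"
echo "keyword fields: $keyword_field_count"

# Assumed way to apply it to a freshly created index (typeless endpoint in 7.x):
# curl -XPUT 'http://localhost:9200/basketball/_mapping?pretty' -d @basketball_mapping_keyword.json -H 'Content-Type: application/json'
```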

# Bulk-Loading Sample Data as Documents

The sample data is shown below.

Contents of twoteam_basketball.json

{ "index" : { "_index" : "basketball", "_type" : "record", "_id" : "1" } }
{"team" : "Chicago","name" : "Michael Jordan", "points" : 30,"rebounds" : 3,"assists" : 4, "blocks" : 3, "submit_date" : "1996-10-11"}
{ "index" : { "_index" : "basketball", "_type" : "record", "_id" : "2" } }
{"team" : "Chicago","name" : "Michael Jordan","points" : 20,"rebounds" : 5,"assists" : 8, "blocks" : 4, "submit_date" : "1996-10-13"}
{ "index" : { "_index" : "basketball", "_type" : "record", "_id" : "3" } }
{"team" : "LA","name" : "Kobe Bryant","points" : 30,"rebounds" : 2,"assists" : 8, "blocks" : 5, "submit_date" : "2014-10-13"}
{ "index" : { "_index" : "basketball", "_type" : "record", "_id" : "4" } }
{"team" : "LA","name" : "Kobe Bryant","points" : 40,"rebounds" : 4,"assists" : 8, "blocks" : 6, "submit_date" : "2014-11-13"}

To bulk-load the sample data shown above, enter the following command.

# ✔ Run the Bulk Request

$ curl -XPOST http://localhost:9200/_bulk?pretty --data-binary @twoteam_basketball.json -H 'Content-Type: application/json'

Always end the bulk JSON file with a newline. The _bulk API requires the last line of the request body to end with a newline character.
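The trailing-newline rule can be checked before sending the file. The sketch below uses a hypothetical helper, ensure_trailing_newline, which is not part of curl or Elasticsearch, and runs it on a throwaway temp file rather than the real twoteam_basketball.json.

```shell
# Sketch: make sure a bulk file ends with a newline before sending it to _bulk.
# ensure_trailing_newline is a hypothetical helper name used only for illustration.
ensure_trailing_newline() {
  local file="$1"
  # Command substitution strips a trailing newline, so the result is non-empty
  # only when the last byte of the file is something other than "\n".
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    printf '\n' >> "$file"
  fi
}

# Demo on a throwaway file whose last line has no newline.
demo_file="$(mktemp)"
printf '{ "index" : { "_index" : "basketball", "_id" : "1" } }\n{"team" : "Chicago"}' > "$demo_file"
ensure_trailing_newline "$demo_file"

# Show the final byte of the file as hex (0a is a newline).
last_byte_hex="$(tail -c 1 "$demo_file" | od -An -tx1 | tr -d ' \n')"
echo "last byte: 0x$last_byte_hex"
```

In the tutorial's setting it would be called as ensure_trailing_newline twoteam_basketball.json right before the curl command. Running it twice is harmless because a file that already ends with a newline is left unchanged.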


# Term Aggregation Exercise

# Group by Team!

Let's split the documents into groups by Team.

Contents of terms_aggs.json

{
	"size" : 0,
	"aggs" : {
		"players" : {
			"terms" : {
				"field" : "team"
			}
		}
	}
}

size:0 - return only the aggregation result, without the matching documents and other details

players - Aggregation name

terms - declares that a term Aggregation is used


Check the Term result with the command below.

$ curl -XGET 'http://localhost:9200/_search?pretty' --data-binary @terms_aggs.json -H 'Content-Type: application/json'

The current team data can be shown as a table:

|      | Doc1    | Doc2    | Doc3 | Doc4 |
| ---- | ------- | ------- | ---- | ---- |
| Team | Chicago | Chicago | LA   | LA   |

As expected, the Term Aggregation result shows each team counted twice. The keys appear in lowercase because team is an analyzed text field.

Result

{
  # ... omitted
  "aggregations" : {
    "players" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "chicago",
          "doc_count" : 2
        },
        {
          "key" : "la",
          "doc_count" : 2
        }
      ]
    }
  }
}
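What the terms aggregation computes here is a group-and-count over the team values. The sketch below reproduces the same counts locally with standard Unix tools, with no Elasticsearch involved; the lowercase step only imitates what the analyzer does to these single-word values, and the key=count output format is an assumption chosen for readability.

```shell
# Team values of the four sample documents, one per line.
teams='Chicago
Chicago
LA
LA'

# Lowercase the values, group identical ones, and count each group.
terms_counts="$(printf '%s\n' "$teams" \
  | tr '[:upper:]' '[:lower:]' \
  | sort \
  | uniq -c \
  | awk '{print $2 "=" $1}')"

echo "$terms_counts"
# prints:
# chicago=2
# la=2
```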

# A More Complex Statistics Example

The example above is a bit too simple to call statistics or analysis, so let's practice with more fields taken into account.

Recording values as in a real basketball game gives the following data.

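|          | Doc1    | Doc2    | Doc3 | Doc4 |
| -------- | ------- | ------- | ---- | ---- |
| Team     | Chicago | Chicago | LA   | LA   |
| Name     | Michael | Michael | Kobe | Kobe |
| Points   | 30      | 20      | 30   | 40   |
| Rebounds | 3       | 5       | 2    | 4    |
| Assists  | 4       | 8       | 8    | 8    |
| Blocks   | 3       | 4       | 5    | 6    |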

Let's group the documents by team and build statistics on each team's performance.

Contents of stats_by_team.json

{
	"size" : 0,
	"aggs" : {
		"team_stats" : {
			"terms" : {
				"field" : "team"
			},
			"aggs" : {
				"stats_score" : {
					"stats" : {
						"field" : "points"
					}
				}
			}
		}
	}
}
  • Groups the documents by team
  • Returns stats statistics on points for each team
$ curl -XGET http://localhost:9200/_search?pretty --data-binary @stats_by_team.json -H 'Content-Type: application/json'

Result

# ... omitted
"aggregations" : {
    "team_stats" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "chicago",
          "doc_count" : 2,
          "stats_score" : {
            "count" : 2,
            "min" : 20.0,
            "max" : 30.0,
            "avg" : 25.0,
            "sum" : 50.0
          }
        },
        {
          "key" : "la",
          "doc_count" : 2,
          "stats_score" : {
            "count" : 2,
            "min" : 30.0,
            "max" : 40.0,
            "avg" : 35.0,
            "sum" : 70.0
          }
        }
      ]
    }
#... omitted

The result shows separate point statistics for the Chicago team and the LA team.
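To cross-check the numbers, the same per-team count, min, max, avg, and sum can be recomputed locally from the four sample documents. This is only a sketch with awk: the team,points input format and the output layout are assumptions for illustration, and nothing here talks to Elasticsearch.

```shell
# team,points for the four sample documents (team already lowercased).
samples='chicago,30
chicago,20
la,30
la,40'

# Accumulate count, min, max, and sum per team, then print one line per team.
team_stats="$(printf '%s\n' "$samples" | awk -F, '
  {
    team = $1; points = $2 + 0
    if (!(team in count) || points < min[team]) min[team] = points
    if (!(team in count) || points > max[team]) max[team] = points
    count[team]++
    sum[team] += points
  }
  END {
    for (team in count)
      printf "%s count=%d min=%d max=%d avg=%.1f sum=%d\n", team, count[team], min[team], max[team], sum[team] / count[team], sum[team]
  }' | sort)"

echo "$team_stats"
# prints:
# chicago count=2 min=20 max=30 avg=25.0 sum=50
# la count=2 min=30 max=40 avg=35.0 sum=70
```

The values match the stats_score objects in the response above. The final sort is there because awk does not guarantee the iteration order of for (team in count).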


This post was written with reference to the Inflearn course "Data Analysis with the ELK Stack (ElasticSearch, Logstash, Kibana)".


Last Updated: 6/18/2023, 2:13:15 PM