• Mindscape ๐Ÿ”ฅ
    • Playlist ๐ŸŽง
  • Ubuntu

    • Wine ํ™˜๊ฒฝ์—์„œ ์นด์นด์˜คํ†ก ์‹คํ–‰ ์‹œ explorer.exe ๋œจ์ง€ ์•Š๊ฒŒ ํ•˜๋Š” ๋ฒ•
    • Wine ์นด์นด์˜คํ†ก ์Šคํฌ๋ฆฐ์ƒท ๋ถ™์—ฌ๋„ฃ๊ธฐ ๋ฌธ์ œ ํ•ด๊ฒฐ (Wayland ํ™˜๊ฒฝ)
    • Wine ์นด์นด์˜คํ†ก ์ด๋ชจ์ง€ ๊นจ์ง ๋ฌธ์ œ ํ•ด๊ฒฐ
  • Wellness

    • ์ฐจ์ „์žํ”ผ (Psyllium Husk)
    • ์—‘์ŠคํŠธ๋ผ ๋ฒ„์ง„ ์˜ฌ๋ฆฌ๋ธŒ์œ  (Extra Virgin Olive Oil)
    • ์ž๊ฐ€๋น„๊ฐ•์„ธ์ฒ™ (Nasal Irrigation)
    • QCY HT08 (MeloBuds Pro Plus)
  • Humanities

    • Nordvik, Russia
    • North Sentinel Island
    • ๋กฑ๊ณ ๋กฑ๊ณ (Rongorongo)
  • ๐Ÿ—‚๏ธ Database System

    • 1. Introduction
    • 2. Relational Model
    • 3. SQL
    • 6. E-R Model
    • 7. Relational Database Design (1)
    • 7. Relational Database Design (2)
    • 13. Data Storage Structures
    • 14. Indexing
    • 15. Query Processing
    • 16. Query Optimization
    • 17. Transactions
    • 18. Concurrency Control
    • 19. Recovery System

14. Indexing

  • Indexing: ์›ํ•˜๋Š” ๋ฐ์ดํ„ฐ์— ๋Œ€ํ•œ ์ ‘๊ทผ ์†๋„๋ฅผ ๋†’์ด๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ๋ฉ”์ปค๋‹ˆ์ฆ˜

  • Search Key: ํŒŒ์ผ์—์„œ ๋ ˆ์ฝ”๋“œ๋ฅผ ์ฐพ๋Š” ๋ฐ ์‚ฌ์šฉ๋˜๋Š” ์†์„ฑ ๋˜๋Š” ์†์„ฑ ์ง‘ํ•ฉ

  • ์ธ๋ฑ์Šค ํŒŒ์ผ์€ search-key์™€ pointer(s)(Blockย pointer,ย offset\text{Block pointer, offset}Blockย pointer,ย offset) ํ˜•ํƒœ์˜ ๋ ˆ์ฝ”๋“œ๋ผ ๋ถˆ๋ฆผ)๋กœ ๊ตฌ์„ฑ

    search-keypointer(s)
    • ์ธ๋ฑ์Šค ํŒŒ์ผ์€ ์ผ๋ฐ˜์ ์œผ๋กœ ์›๋ณธ ๋ฐ์ดํ„ฐ ํŒŒ์ผ๋ณด๋‹ค ํ›จ์”ฌ ์ž‘์Œ.
  • ๋‘ ๊ฐ€์ง€ ๊ธฐ๋ณธ ์ธ๋ฑ์Šค ์ข…๋ฅ˜

    • Ordered indices: Search key๊ฐ€ ์ •๋ ฌ๋œ ์ˆœ์„œ๋กœ ์ €์žฅ
    • Hash indices: Search key๊ฐ€ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ "buckets"์— ๊ท ์ผํ•˜๊ฒŒ ๋ถ„์‚ฐ
  • Index evaluation metrics

    • ํšจ์œจ์ ์œผ๋กœ ์ง€์›๋˜๋Š” access types
      • Point queries: Search key์— ๋Œ€ํ•ด ์ง€์ •๋œ ๊ฐ’์„ ๊ฐ–๋Š” ๋ ˆ์ฝ”๋“œ
      • Range queries: Search key๊ฐ’์ด ์ง€์ •๋œ ๋ฒ”์œ„ ๋‚ด์— ์žˆ๋Š” ๋ ˆ์ฝ”๋“œ
    • ๋ฐ์ดํ„ฐ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ access/insertion/deletion times
    • ๊ณต๊ฐ„ overhead

Ordered Indices

  • Ordered index: ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ์— search key๊ฐ’์— ๋”ฐ๋ผ ์ •๋ ฌ๋˜์–ด ์ €์žฅ
  • Clustering index
    • Sequentially ordered data file์—์„œ, search key๊ฐ€ ๋ฐ์ดํ„ฐ ํŒŒ์ผ์˜ ์ˆœ์ฐจ์  ์ˆœ์„œ๋„ ์ •์˜ํ•˜๋Š” ์ธ๋ฑ์Šค
    • Primary index๋ผ๊ณ ๋„ ํ•จ.
    • ๊ธฐ๋ณธ ์ธ๋ฑ์Šค์˜ search key๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ primary key์ด์ง€๋งŒ, ํ•„์ˆ˜๋Š” ์•„๋‹˜
  • 2์ฐจ ์ธ๋ฑ์Šค
    • Search key๊ฐ€ ๋ฐ์ดํ„ฐ ํŒŒ์ผ์˜ ์ˆœ์ฐจ์  ์ˆœ์„œ์™€ ๋‹ค๋ฅธ ์ˆœ์„œ๋ฅผ ์ง€์ •ํ•˜๋Š” ์ธ๋ฑ์Šค
    • Nonclustering index๋ผ๊ณ ๋„ ํ•จ.
  • Index-sequential file
    • Search key๋ฅผ ๊ธฐ์ค€์œผ๋กœ ์ •๋ ฌ๋œ sequential data file์—, search key์— ๋Œ€ํ•œ clustering ์ธ๋ฑ์Šค๊ฐ€ ์žˆ๋Š” ํŒŒ์ผ

Dense Index

  • Dense index: ๋ฐ์ดํ„ฐ ํŒŒ์ผ์˜ ๋ชจ๋“  search-key๊ฐ’์— ๋Œ€ํ•ด index entry๊ฐ€ ๋‚˜ํƒ€๋‚จ

Sparse Index

  • Sparse Index: ์ผ๋ถ€ search-key๊ฐ’์— ๋Œ€ํ•ด์„œ๋งŒ ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ํฌํ•จ.
    • ๋ ˆ์ฝ”๋“œ๊ฐ€ search key๋ฅผ ๊ธฐ์ค€์œผ๋กœ ์ˆœ์ฐจ์ ์œผ๋กœ ์ •๋ ฌ๋˜์–ด ์žˆ์„ ๋•Œ ์ ์šฉ ๊ฐ€๋Šฅ
  • Search-key๊ฐ’ KKK๋ฅผ ๊ฐ€์ง„ ๋ ˆ์ฝ”๋“œ๋ฅผ ์ฐพ์œผ๋ ค๋ฉด
    • KKK๋ณด๋‹ค ์ž‘๊ฑฐ๋‚˜ ๊ฐ™์€ search-key๊ฐ’ ์ค‘ ๊ฐ€์žฅ ํฐ ๊ฐ’์„ ๊ฐ€์ง„ index entry๋ฅผ ์ฐพ์Œ.
    • ํ•ด๋‹น index ๋ ˆ์ฝ”๋“œ๊ฐ€ ๊ฐ€๋ฆฌํ‚ค๋Š” ๋ ˆ์ฝ”๋“œ์—์„œ๋ถ€ํ„ฐ ํŒŒ์ผ ์ˆœ์ฐจ ๊ฒ€์ƒ‰
  • Dense indices์™€ ๋น„๊ต
    • ๋ฐ์ดํ„ฐ ๋ ˆ์ฝ”๋“œ์˜ ์‚ฝ์ž…/์‚ญ์ œ์— ๋Œ€ํ•ด less space and less maintenance overhead
    • ์ผ๋ฐ˜์ ์œผ๋กœ ๋ ˆ์ฝ”๋“œ๋ฅผ ์ฐพ๋Š” ๋ฐ dense index๋ณด๋‹ค ๋А๋ฆผ.
  • Access time๊ณผ space overhead ์‚ฌ์ด์˜ trade-off ์กด์žฌ
  • Good compromise
    • Clustering ์ธ๋ฑ์Šค์˜ ๊ฒฝ์šฐ: ํŒŒ์ผ์˜ ๋ชจ๋“  block์— ๋Œ€ํ•ด index entry๋ฅผ ๊ฐ€์ง„ sparse index. ํ•ด๋‹น block์—์„œ least search-key๊ฐ’์— ํ•ด๋‹น
      • ์ฐธ๊ณ : Query processing์˜ ์ฃผ์š” ๋น„์šฉ์€ block I/O time
      • ๋ฉ”๋ชจ๋ฆฌ ๋‚ด ๋ธ”๋ก scanning ์‹œ๊ฐ„์€ ์‚ฌ์†Œํ•จ.
      • ์ด ๋ฐฉ์‹์€ dense index์™€ ๋™์ผํ•œ ์ˆ˜์˜ block I/O๋ฅผ ๊ฐ€์ง€๋ฉด์„œ๋„ space overhead๊ฐ€ ํ›จ์”ฌ ์ ์Œ.

Multilevel Index

  • ์ธ๋ฑ์Šค๊ฐ€ ๋ฉ”๋ชจ๋ฆฌ์— ๋งž์ง€ ์•Š์œผ๋ฉด, access๊ฐ€ expensive
  • ํ•ด๊ฒฐ์ฑ…: Multilevel Index
    • ๋””์Šคํฌ์— ๋ณด๊ด€๋œ ์ธ๋ฑ์Šค๋ฅผ sequential file๋กœ ์ทจ๊ธ‰ํ•˜๊ณ  ๊ทธ ์œ„์— sparse ์ธ๋ฑ์Šค๋ฅผ ๊ตฌ์ถ•
    • Outer index: Basic ์ธ๋ฑ์Šค์˜ sparse index
    • Inner index: Basic ์ธ๋ฑ์Šค ํŒŒ์ผ
    • Outer index์กฐ์ฐจ ๋„ˆ๋ฌด ์ปค์„œ ๋ฉ”์ธ ๋ฉ”๋ชจ๋ฆฌ์— ๋งž์ง€ ์•Š์œผ๋ฉด, ๋˜ ๋‹ค๋ฅธ level์˜ index ์ƒ์„ฑ ๊ฐ€๋Šฅ
  • ๋ชจ๋“  level์˜ ์ธ๋ฑ์Šค๋Š” ํŒŒ์ผ์—์„œ์˜ ์‚ฝ์ž… ๋˜๋Š” ์‚ญ์ œ ์‹œ์— ์—…๋ฐ์ดํŠธ๋˜์–ด์•ผ ํ•จ.

Index Update: Insertion

  • Index๋Š” database modification์— overhead๋ฅผ ๋ถ€๊ณผ
    • ๋ ˆ์ฝ”๋“œ๊ฐ€ ์‚ฝ์ž… ๋˜๋Š” ์‚ญ์ œ๋  ๋•Œ, ๊ด€๊ณ„ํ˜•์˜ ๋ชจ๋“  ์ธ๋ฑ์Šค๊ฐ€ ๊ฐฑ์‹ ๋˜์–ด์•ผ ํ•จ.
    • ๋ ˆ์ฝ”๋“œ๊ฐ€ ๊ฐฑ์‹ ๋  ๋•Œ, ๊ฐฑ์‹ ๋œ ์†์„ฑ์— ๋Œ€ํ•œ ๋ชจ๋“  ์ธ๋ฑ์Šค๊ฐ€ ๊ฐฑ์‹ ๋˜์–ด์•ผ ํ•จ.
  • Index update upon insertion: 1) Dense indices
    • ์‚ฝ์ž…๋˜๋Š” ๋ ˆ์ฝ”๋“œ์˜ search-key๊ฐ’์„ ์‚ฌ์šฉํ•˜์—ฌ lookup ์ˆ˜ํ–‰
    • Search-key๊ฐ’์ด ์ธ๋ฑ์Šค์— ๋‚˜ํƒ€๋‚˜์ง€ ์•Š์œผ๋ฉด, insert
    • Search-key๊ฐ’์ด ์ธ๋ฑ์Šค์— ๋‚˜ํƒ€๋‚˜๋ฉด
      • ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๊ฐ€ ๋™์ผํ•œ search-key๊ฐ’์„ ๊ฐ€์ง„ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๊ฒฝ์šฐ, ์ƒˆ๋กœ์šด ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋ฅผ ์ถ”๊ฐ€
      • ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๊ฐ€ ํ•ด๋‹น search-key๊ฐ’์„ ๊ฐ€์ง„ ์ฒซ ๋ฒˆ์งธ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋งŒ ์ €์žฅ. ์ƒˆ๋กœ์šด ๋ ˆ์ฝ”๋“œ๋ฅผ ๋™์ผํ•œ search-key๊ฐ’์„ ๊ฐ€์ง„ ๋‹ค๋ฅธ ๋ ˆ์ฝ”๋“œ๋“ค ๋’ค์— ๋ฐฐ์น˜
    • ์ธ๋ฑ์Šค๋Š” sequential files๋กœ ์œ ์ง€ ๊ด€๋ฆฌ๋จ โ†’ ์ƒˆ๋กœ์šด ์—”ํŠธ๋ฆฌ๋ฅผ ์œ„ํ•œ space๋ฅผ ์ƒ์„ฑํ•ด์•ผ ํ•˜๋ฉฐ, overflow blocks์ด ํ•„์š”ํ•  ์ˆ˜ ์žˆ์Œ.
  • Index update upon insertion: 2) Sparse indices
    • ์‚ฝ์ž…๋˜๋Š” ๋ ˆ์ฝ”๋“œ์˜ search-key๊ฐ’์„ ์‚ฌ์šฉํ•˜์—ฌ lookup ์ˆ˜ํ–‰
    • ์ธ๋ฑ์Šค๊ฐ€ ํŒŒ์ผ์˜ ๊ฐ ๋ธ”๋ก์— ๋Œ€ํ•œ ์—”ํŠธ๋ฆฌ๋ฅผ ์ €์žฅํ•˜๋Š” ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค๋ฅผ ์ˆ˜์ •ํ•  ํ•„์š”๊ฐ€ ์—†์Œ.
      • ์ƒˆ๋กœ์šด block์ด ์ƒ์„ฑ๋˜๋Š” ๊ฒฝ์šฐ: ์ƒˆ๋กœ์šด ๋ธ”๋ก์˜ ์ฒซ ๋ฒˆ์งธ search-key๊ฐ’์ด ์ธ๋ฑ์Šค์— ์‚ฝ์ž…๋จ.
      • ์ƒˆ๋กœ์šด ๋ ˆ์ฝ”๋“œ๊ฐ€ ๋ธ”๋ก์—์„œ ๊ฐ€์žฅ ์ž‘์€ search-key๊ฐ’์„ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ: ๋ธ”๋ก์„ ๊ฐ€๋ฆฌํ‚ค๋Š” ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ์—…๋ฐ์ดํŠธ

Index Update: Deletion

  • Index update upon deletion: 1) Dense indices
    • (Case 1) ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ๊ฐ€ ํ•ด๋‹น search-key๊ฐ’์„ ๊ฐ€์ง„ ํŒŒ์ผ ๋‚ด ์œ ์ผํ•œ ๋ ˆ์ฝ”๋“œ์ธ ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค์—์„œ๋„ search-key๊ฐ€ ์‚ญ์ œ๋จ.
    • ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ
      • ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๊ฐ€ ๋™์ผํ•œ search-key๊ฐ’์„ ๊ฐ€์ง„ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๊ฒฝ์šฐ, ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ ์‚ญ์ œ
      • ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๊ฐ€ ํ•ด๋‹น search-key๊ฐ’์„ ๊ฐ€์ง„ ์ฒซ ๋ฒˆ์งธ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋งŒ ์ €์žฅ
        • (Case 2) ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ๊ฐ€ ํ•ด๋‹น search-key๊ฐ’์„ ๊ฐ€์ง„ ์ฒซ ๋ฒˆ์งธ ๋ ˆ์ฝ”๋“œ์ธ ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ์—…๋ฐ์ดํŠธํ•˜์—ฌ ๋‹ค์Œ ๋ ˆ์ฝ”๋“œ๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋„๋ก ํ•จ.
        • (Case 3) ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค ๊ฐฑ์‹ ์€ required๋˜์ง€ ์•Š์Œ.
  • Index update upon deletion: 2) Sparse indices
    • (Case 1) ์ธ๋ฑ์Šค๊ฐ€ ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ์˜ search-key๊ฐ’๊ณผ ์ผ์น˜ํ•˜๋Š” index entry๋ฅผ ํฌํ•จํ•˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ, ์•„๋ฌด๊ฒƒ๋„ ํ•˜์ง€ ์•Š์Œ.
    • ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ
      • (Case 2) ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ๊ฐ€ ํ•ด๋‹น search-key๋ฅผ ๊ฐ€์ง„ ์œ ์ผํ•œ ๋ ˆ์ฝ”๋“œ์ธ ๊ฒฝ์šฐ, index entry๋ฅผ ํŒŒ์ผ์—์„œ ๋‹ค์Œ search-key๊ฐ’์œผ๋กœ ๋Œ€์ฒด
        • ๋‹ค์Œ search-key๊ฐ’์ด ์ด๋ฏธ index entry๋ฅผ ๊ฐ€์ง€๊ณ  ์žˆ๋‹ค๋ฉด, ํ•ด๋‹น ์—”ํŠธ๋ฆฌ๋Š” ์‚ญ์ œ๋จ. -) ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ, search-key๊ฐ’์— ๋Œ€ํ•œ index entry๊ฐ€ ์‚ญ์ œ๋œ ๋ ˆ์ฝ”๋“œ๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” ๊ฒฝ์šฐ, ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ์—…๋ฐ์ดํŠธํ•˜์—ฌ ๋™์ผํ•œ search-key๊ฐ’์„ ๊ฐ€์ง„ ๋‹ค์Œ ๋ ˆ์ฝ”๋“œ๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋„๋ก ํ•จ.

2์ฐจ ์ธ๋ฑ์Šค: An example

  • instructor ํŒŒ์ผ์˜ salary ํ•„๋“œ์— ๋Œ€ํ•œ 2์ฐจ ์ธ๋ฑ์Šค
  • ์ธ๋ฑ์Šค ๋ ˆ์ฝ”๋“œ๋Š” ํ•ด๋‹น ํŠน์ • search-key๊ฐ’์„ ๊ฐ€์ง„ ๋ชจ๋“  ์‹ค์ œ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋ฅผ ํฌํ•จํ•˜๋Š” bucket์„ ๊ฐ€๋ฆฌํ‚ด
  • 2์ฐจ ์ธ๋ฑ์Šค๋Š” denseํ•ด์•ผ ํ•จ.
    • Index ์—…๋ฐ์ดํŠธ ๊ณผ์ •์€ clustering ์ธ๋ฑ์Šค์˜ dense index ๊ฒฝ์šฐ์™€ ๋™์ผ
  • 2์ฐจ(nonclustering) ์ธ๋ฑ์Šค๋ฅผ ์‚ฌ์šฉํ•œ ์ˆœ์ฐจ์  ์Šค์บ” HDD์—์„œ expensive
    • ๊ฐ ๋ ˆ์ฝ”๋“œ ์ ‘๊ทผ ์‹œ ๋””์Šคํฌ์—์„œ ์ƒˆ๋กœ์šด block์„ ๊ฐ€์ ธ์™€์•ผ ํ•  ์ˆ˜ ์žˆ์Œ.

Indices on Multiple Keys

  • Composite search key
    • ๋‘ ๊ฐœ ์ด์ƒ์˜ ์†์„ฑ์„ ํฌํ•จํ•˜๋Š” search key
    • ์˜ˆ: instructor relation์˜ ์†์„ฑ (name, ID)์— ๋Œ€ํ•œ ์ธ๋ฑ์Šค
    • ๊ฐ’์€ ์‚ฌ์ „์ˆœ์œผ๋กœ(lexicographically) ์ •๋ ฌ๋จ.
      • E.g. (John,12121)<(John,13514)(\text{John}, 12121) < (\text{John}, 13514)(John,12121)<(John,13514) ๋ฐ (John,13514)<(Peter,11223)(\text{John}, 13514) <(\text{Peter}, 11223)(John,13514)<(Peter,11223)
  • name๋งŒ์œผ๋กœ queryํ•˜๊ฑฐ๋‚˜, (name, ID)๋กœ query ๊ฐ€๋Šฅ

B +^++-Tree(and B-Tree) Index Files

B +^++-Tree Index Files

  • Index-sequential file organization์˜ ๋‹จ์ 
    • ํŒŒ์ผ์ด ์ปค์ง์— ๋”ฐ๋ผ overflow blocks์ด ๋งŽ์ด ์ƒ์„ฑ๋˜์–ด ์„ฑ๋Šฅ ์ €ํ•˜
    • ๋น„์šฉ์ด ๋งŽ์ด ๋“ค๊ณ  ์ฃผ๊ธฐ์ ์ธ ์ „์ฒด ํŒŒ์ผ reorganization ํ•„์š”
  • B +^++-tree index ๊ตฌ์กฐ์˜ ์ด์ 
    • ์‚ฝ์ž…/์‚ญ์ œ ์‹œ ์ž‘๊ณ  localํ•œ ๋ณ€ํ™”๋งŒ์œผ๋กœ ์ž๋™์œผ๋กœ self-reorganizes
    • ์„ฑ๋Šฅ ์œ ์ง€๋ฅผ ์œ„ํ•ด ์ „์ฒด ํŒŒ์ผ์˜ reorganization์ด ํ•„์š”ํ•˜์ง€ ์•Š์Œ.
  • B +^++-trees์˜ (์‚ฌ์†Œํ•œ) ๋‹จ์ 
    • ์ถ”๊ฐ€์ ์ธ ์‚ฝ์ž…/์‚ญ์ œ overhead, ๊ณต๊ฐ„ overhead
  • B +^++-trees์˜ ์žฅ์ ์ด ๋‹จ์ ๋ณด๋‹ค ํฌ๊ธฐ ๋•Œ๋ฌธ์— ๋„๋ฆฌ ์‚ฌ์šฉ๋จ.
  • B +^++-tree๋Š” ๋‹ค์Œ ์†์„ฑ์„ ๋งŒ์กฑํ•˜๋Š” rooted tree
    • Root์—์„œ leaf๊นŒ์ง€์˜ ๋ชจ๋“  ๊ฒฝ๋กœ๋Š” ๊ธธ์ด๊ฐ€ ๊ฐ™์Œ: Balanced tree
    • Root๋‚˜ leaf๊ฐ€ ์•„๋‹Œ ๊ฐ node๋Š” โŒˆn/2โŒ‰\lceil n/2 \rceilโŒˆn/2โŒ‰์™€ nnn ์‚ฌ์ด์˜ children์„ ๊ฐ€์ง.
      • ์ตœ์†Œ โŒˆn/2โŒ‰\lceil n/2 \rceilโŒˆn/2โŒ‰, ์ตœ๋Œ€ nnn๊ฐœ์˜ children(pointers)
    • Leaf node๋Š” โŒˆ(nโˆ’1)/2โŒ‰\lceil(n-1)/2 \rceilโŒˆ(nโˆ’1)/2โŒ‰์™€ nโˆ’1n-1nโˆ’1 ์‚ฌ์ด์˜ ๊ฐ’์„ ๊ฐ€์ง.
      • ์ตœ์†Œ โŒˆ(nโˆ’1)/2โŒ‰\lceil(n-1)/2 \rceilโŒˆ(nโˆ’1)/2โŒ‰, ์ตœ๋Œ€ nโˆ’1n-1nโˆ’1์˜ ๊ฐ’(not pointers)
    • ํŠน์ˆ˜ํ•œ ๊ฒฝ์šฐ๋“ค
      • Root๊ฐ€ leaf๊ฐ€ ์•„๋‹Œ ๊ฒฝ์šฐ, ์ตœ์†Œ 2๊ฐœ์˜ children์„ ๊ฐ€์ง.
      • Root๊ฐ€ leaf์ธ ๊ฒฝ์šฐ, 0๊ณผ (nโˆ’1)(n-1)(nโˆ’1) ์‚ฌ์ด์˜ ๊ฐ’์„ ๊ฐ€์งˆ ์ˆ˜ ์žˆ์Œ.
  • B +^++-tree๋Š” multilevel ์ธ๋ฑ์Šค์ด์ง€๋งŒ, multilevel index-sequential ํŒŒ์ผ๊ณผ๋Š” ๋‹ค๋ฅธ ๊ตฌ์กฐ๋ฅผ ๊ฐ€์ง.

B +^++-Tree Node Structure

  • Typical node
    P1P_1P1โ€‹K1K_1K1โ€‹P2P_2P2โ€‹...Pn+1P_{n+1}Pn+1โ€‹Kn+1K_{n + 1}Kn+1โ€‹PnP_nPnโ€‹
    • KiK_iKiโ€‹๋Š” search-key๊ฐ’
    • PiP_iPiโ€‹๋Š” children ๋˜๋Š” records/buckets of ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ
    • ์ฐธ๊ณ : ์ตœ๋Œ€ nnn๊ฐœ์˜ pointers์™€ nโˆ’1n-1nโˆ’1๊ฐœ์˜ key๊ฐ’์ด ์žˆ์„ ์ˆ˜ ์žˆ์Œ.
    • ๋…ธ๋“œ ๋‚ด์˜ search-keys๋Š” ์ •๋ ฌ๋จ.
      • K1<K2<K3<โ‹ฏ<Knโˆ’1K_1 < K_2 < K_3 < \dots < K_{n-1}K1โ€‹<K2โ€‹<K3โ€‹<โ‹ฏ<Knโˆ’1โ€‹
      • (์ฒ˜์Œ์—” ์ค‘๋ณต๋œ keys๊ฐ€ ์—†๋‹ค๊ณ  ๊ฐ€์ •, ์ดํ›„์— ์ค‘๋ณต์„ ๋‹ค๋ฃฐ ๊ฒƒ)

Leaf Nodes in B +^++-Trees

  • Leaf node์˜ ์†์„ฑ
    • i=1,ย 2,ย โ€ฆ,ย nโˆ’1i =1,~2,~\dots,~n-1i=1,ย 2,ย โ€ฆ,ย nโˆ’1์— ๋Œ€ํ•ด, ํฌ์ธํ„ฐ PiP_iPiโ€‹๋Š” search-key๊ฐ’ KiK_iKiโ€‹๋ฅผ ๊ฐ€์ง„ ํŒŒ์ผ ๋ ˆ์ฝ”๋“œ๋ฅผ ๊ฐ€๋ฆฌํ‚ด
    • LiL_iLiโ€‹์™€ LjL_jLjโ€‹๊ฐ€ leaf nodes์ด๊ณ  i<ji < ji<j์ธ ๊ฒฝ์šฐ, LiL_iLiโ€‹์˜ search-key๊ฐ’์€ LjL_jLjโ€‹์˜ search-key๊ฐ’๋ณด๋‹ค ์ž‘์Œ.
    • PnP_nPnโ€‹์€ search-key ์ˆœ์„œ๋กœ ๋‹ค์Œ leaf node๋ฅผ ๊ฐ€๋ฆฌํ‚ด
    • ์ˆœ์ฐจ ์ฒ˜๋ฆฌ๋ฅผ ์‹ ์†ํ•˜๊ฒŒ ํ•˜๊ธฐ ์œ„ํ•ด ๋ชจ๋“  leaf nodes๋ฅผ search-key ์ˆœ์„œ๋กœ chain together
    • B +^++-tree ์ธ๋ฑ์Šค๊ฐ€ dense index๋กœ ์‚ฌ์šฉ๋˜๋Š” ๊ฒฝ์šฐ(์ „ํ˜•์ ์ธ ๊ฒฝ์šฐ, ๋ชจ๋“  search-key๊ฐ’์ด ์ผ๋ถ€ leaf node์— ๋‚˜ํƒ€๋‚˜์•ผ ํ•จ. ๊ทธ๋Ÿฌ๋‚˜, non-leaf node์— ๋‚˜ํƒ€๋‚˜๋Š” search-key๋Š” ๋ ˆ์ฝ”๋“œ ์‚ญ์ œ๋กœ ์ธํ•ด leaf node์— ๋‚˜ํƒ€๋‚˜์ง€ ์•Š์„ ์ˆ˜ ์žˆ์Œ.

Non-Leaf Nodes in B +^++-Trees

  • Non-leaf nodes๋Š” leaf nodes์— ๋Œ€ํ•œ multi-level sparse ์ธ๋ฑ์Šค๋ฅผ ํ˜•์„ฑ
  • nnn๊ฐœ์˜ ํฌ์ธํ„ฐ๋ฅผ ๊ฐ€์ง„ non-leaf node์˜ ๊ฒฝ์šฐ
    • P1P_1P1โ€‹์ด ๊ฐ€๋ฆฌํ‚ค๋Š” subtree์˜ ๋ชจ๋“  search-keys๋Š” K1K_1K1โ€‹๋ณด๋‹ค ์ž‘์Œ.
    • 2โ‰คiโ‰คnโˆ’12 \leq i \leq n-12โ‰คiโ‰คnโˆ’1์— ๋Œ€ํ•ด, PiP_iPiโ€‹๊ฐ€ ๊ฐ€๋ฆฌํ‚ค๋Š” subtree์˜ ๋ชจ๋“  search-keys๋Š” Kiโˆ’1K_{i-1}Kiโˆ’1โ€‹๋ณด๋‹ค ํฌ๊ฑฐ๋‚˜ ๊ฐ™๊ณ  KiK_iKiโ€‹๋ณด๋‹ค ์ž‘์Œ.
    • PnP_nPnโ€‹์ด ๊ฐ€๋ฆฌํ‚ค๋Š” subtree์˜ ๋ชจ๋“  search-keys๋Š” Knโˆ’1K_{n-1}Knโˆ’1โ€‹๋ณด๋‹ค ํฌ๊ฑฐ๋‚˜ ๊ฐ™์Œ.
  • ์ฐธ๊ณ : Non-leaf nodes๋Š” ๊ทธ๋“ค ์‚ฌ์ด์— ์ค‘๋ณต๋œ search-key๊ฐ’์„ ๊ฐ€์ง€์ง€ ์•Š์Œ.
  • ์ผ๋ฐ˜์ ์ธ ๊ตฌ์กฐ

Example of B +^++-Tree index for instructor (n=4n=4n=4, search key: name)

Example of B +^++-Tree index for instructor (n=6n=6n=6)

  • Root๋Š” ์ตœ์†Œ 2๊ฐœ์˜ children์„ ๊ฐ€์ ธ์•ผ ํ•จ.
  • Interal nodes๋Š” 3์—์„œ 6 ์‚ฌ์ด์˜ children์„ ๊ฐ€์ ธ์•ผ ํ•จ. (โŒˆn/2โŒ‰\lceil n / 2 \rceilโŒˆn/2โŒ‰๊ณผ nnn ์‚ฌ์ด)
  • Leaf nodes๋Š” 3๊ณผ 5 ์‚ฌ์ด์˜ ๊ฐ’์„ ๊ฐ€์ ธ์•ผ ํ•จ. (โŒˆ(nโˆ’1)/2โŒ‰\lceil (n-1) / 2 \rceilโŒˆ(nโˆ’1)/2โŒ‰๊ณผ nโˆ’1n-1nโˆ’1 ์‚ฌ์ด)
  • ํŠธ๋ฆฌ์˜ ๋†’์ด๊ฐ€ ์ด์ „(n=4n=4n=4)์— ๋น„ํ•ด ๋‚ฎ์•„์กŒ์Œ์— ์ฃผ๋ชฉํ•  ๊ฒƒ

Observations about B +^++-trees

  • Node ๊ฐ„์˜ ์—ฐ๊ฒฐ์ด ํฌ์ธํ„ฐ๋กœ ์ด๋ฃจ์–ด์ง€๊ธฐ ๋•Œ๋ฌธ์—, "๋…ผ๋ฆฌ์ ์œผ๋กœ" ๊ฐ€๊นŒ์šด blocks์ด "๋ฌผ๋ฆฌ์ ์œผ๋กœ" ๊ฐ€๊นŒ์šธ ํ•„์š”๋Š” ์—†์Œ.
  • B +^++-tree์˜ non-leaf levels์€ hierarchy of sparse ์ธ๋ฑ์Šค๋ฅผ ํ˜•์„ฑ
  • B +^++-tree๋Š” ์ƒ๋Œ€์ ์œผ๋กœ ์ž‘์€ ์ˆ˜์˜ ๊ณ„์ธต์„ ํฌํ•จ.
    • Root ์•„๋ž˜ ๋ ˆ๋ฒจ์€ ์ตœ์†Œ 2โ‹…โŒˆn/2โŒ‰2 \cdot \lceil n/2 \rceil2โ‹…โŒˆn/2โŒ‰์˜ ๊ฐ’
    • ๋‹ค์Œ ๋ ˆ๋ฒจ์€ ์ตœ์†Œ 2โ‹…โŒˆn/2โŒ‰โ‹…โŒˆn/2โŒ‰2 \cdot \lceil n/2 \rceil \cdot \lceil n/2 \rceil2โ‹…โŒˆn/2โŒ‰โ‹…โŒˆn/2โŒ‰์˜ ๊ฐ’
  • ํŒŒ์ผ์— KKK๊ฐœ์˜ search-key๊ฐ’์ด ์žˆ๋Š” ๊ฒฝ์šฐ, tree ๋†’์ด๋Š” โŒˆlogโกโŒˆn/2โŒ‰(K)โŒ‰\lceil \log_{\lceil n/2 \rceil}(K) \rceilโŒˆlogโŒˆn/2โŒ‰โ€‹(K)โŒ‰๋ฅผ ์ดˆ๊ณผํ•˜์ง€ ์•Š์Œ.
  • ๋”ฐ๋ผ์„œ search๊ฐ€ ํšจ์œจ์ ์œผ๋กœ ์ˆ˜ํ–‰๋  ์ˆ˜ ์žˆ์Œ.
  • ์ธ๋ฑ์Šค๊ฐ€ logarithmic time์œผ๋กœ ์žฌ๊ตฌ์„ฑ๋  ์ˆ˜ ์žˆ์œผ๋ฏ€๋กœ, main file์— ๋Œ€ํ•œ ์‚ฝ์ž…/์‚ญ์ œ ๋˜ํ•œ ํšจ์œจ์ ์œผ๋กœ ์ฒ˜๋ฆฌ๋  ์ˆ˜ ์žˆ์Œ.

Queries on B +^++-Trees: Point Query

function find(v)
1. Set C = root node
2. while(C is not a leaf node) begin
   Let i = smallest number s.t. v โ‰ค C.Ki
   if there is no such number i then
      /* v is larger than every key in C */
      Set C = the node pointed by the last non-null ํฌ์ธํ„ฐ in C
   else if(v = C.Ki ) Set C = C.Pi +1
   else set C = C.Pi /* v < C.Ki */
   end
   /* Now, C is a leaf node */
3. if for some i, Ki = v then return C.Pi
4. else return null /* no record with search-key๊ฐ’ v exists */

Queries on B +^++-Trees: Range Query

  • Range queries: ์ฃผ์–ด์ง„ ๋ฒ”์œ„ ๋‚ด์˜ search key๊ฐ’์„ ๊ฐ€์ง„ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ๋ฅผ ์ฐพ์Œ.
  • function findRange(lb, ub)๋Š” lbโ‰คVโ‰คublb \le V \le ublbโ‰คVโ‰คub์ธ search key๊ฐ’ VVV๋ฅผ ๊ฐ€์ง„ ๋ชจ๋“  ๋ ˆ์ฝ”๋“œ ์ง‘ํ•ฉ์„ ๋ฐ˜ํ™˜
    1. $C=lb $๊ฐ€ ๋‚˜ํƒ€๋‚  leaf node๋ฅผ ์ฐพ์Œ( find(v)์—์„œ CCC๋ฅผ ์ฐพ๋Š” ๊ฒƒ๊ณผ ๋™์ผ).
    2. iย =ย Ci~=~Ciย =ย C ์—์„œ Kiโ‰ฅlbK_i \ge lbKiโ€‹โ‰ฅlb์ธ ์ตœ์†Ÿ๊ฐ’
    3. while(Kiโ‰คubK_i \le ubKiโ€‹โ‰คub)
    • C.PiC.P_iC.Piโ€‹๋ฅผ results์— add
    • i=i+1i=i+1i=i+1 (CCC์— ๋ ˆ์ฝ”๋“œ๊ฐ€ ๋” ์žˆ๋‹ค๋ฉด) ๋˜๋Š” i=1i = 1i=1ํ•˜๋ฉฐ ๋‹ค์Œ leaf node๋กœ ์ด๋™
  • ์‹ค์ œ ๊ตฌํ˜„์€ ์ผ๋ฐ˜์ ์œผ๋กœ next() ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์ผ์น˜ํ•˜๋Š” ๋ ˆ์ฝ”๋“œ๋ฅผ ํ•œ ๋ฒˆ์— ๊ฐ€์ ธ์˜ค๋Š” iterator ์ธํ„ฐํŽ˜์ด์Šค๋ฅผ ์ œ๊ณต

Queries on B +^++-Trees: Cost Analysis

  • ํŒŒ์ผ์— KKK๊ฐœ์˜ search-key๊ฐ’์ด ์žˆ๋Š” ๊ฒฝ์šฐ, ํŠธ๋ฆฌ์˜ ๋†’์ด๋Š” โŒˆlogโกโŒˆn/2โŒ‰(K)โŒ‰\lceil \log_{\lceil n/2 \rceil}(K) \rceilโŒˆlogโŒˆn/2โŒ‰โ€‹(K)โŒ‰๋ฅผ ์ดˆ๊ณผํ•˜์ง€ ์•Š์Œ.
  • Node๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ disk block๊ณผ ๊ฐ™์€ ํฌ๊ธฐ, ์ผ๋ฐ˜์ ์œผ๋กœ 4KB
    • nnn์€ ์ผ๋ฐ˜์ ์œผ๋กœ ์•ฝ 100
    • Search-key size๊ฐ€ 12๋ฐ”์ดํŠธ์ธ ๊ฒฝ์šฐ, nnn์€ ์•ฝ 200
  • 100๋งŒ๊ฐœ์˜ search key๊ฐ’ ๋ฐ n=100n=100n=100์ธ ๊ฒฝ์šฐ
    • Root์—์„œ leaf๊นŒ์ง€์˜ index lookup์— ๋Œ€ํ•ด ์ตœ๋Œ€ โŒˆlogโก50(1,000,000)โŒ‰=4\lceil \log_{50}(1,000,000) \rceil = 4โŒˆlog50โ€‹(1,000,000)โŒ‰=4 nodes ์ ‘๊ทผ๋จ.
    • 100๋งŒ๊ฐœ์˜ search key๊ฐ’์„ ๊ฐ€์ง„ balanced binary tree์™€ ๋น„๊ต: ์กฐํšŒ ์‹œ ์•ฝ 20 nodes ์ ‘๊ทผ๋จ.
    • ๋ชจ๋“  node access๋Š” disk I/O๋ฅผ ํ•„์š”๋กœ ํ•  ์ˆ˜ ์žˆ์œผ๋ฉฐ, ์•ฝ 20ms์˜ ๋น„์šฉ์ด ๋“ค๊ธฐ ๋•Œ๋ฌธ์— ์œ„์˜ ์ฐจ์ด๋Š” ์œ ์˜๋ฏธํ•จ.
  • ์ธ๋ฑ์Šค๋ฅผ traverseํ•œ ํ›„, ์ผ์น˜ํ•˜๋Š” ๋ ˆ์ฝ”๋“œ๋ฅผ fetchํ•˜๊ธฐ ์œ„ํ•ด ํ•œ ๋ฒˆ์˜ random I/O๊ฐ€ ๋” ํ•„์š”

Non-Unique Keys

  • Search key aia_iaiโ€‹๊ฐ€ not unique, ๋Œ€์‹  uniqueํ•œ composite key (ai,Ap)(a_i, A_p)(aiโ€‹,Apโ€‹)์— ๋Œ€ํ•œ ์ธ๋ฑ์Šค๋ฅผ ์ƒ์„ฑ
    • ApA_pApโ€‹๋Š” primary key, record ID ๋˜๋Š” uniqueness๋ฅผ ๋ณด์žฅํ•˜๋Š” ๊ธฐํƒ€ ์†์„ฑ์ผ ์ˆ˜ ์žˆ์Œ.
  • ai=va_i = vaiโ€‹=v์— ๋Œ€ํ•œ search๋Š” composite key์— ๋Œ€ํ•œ range search์œผ๋กœ ๊ตฌํ˜„ ๊ฐ€๋Šฅ
    • Range (v,โˆ’โˆž)(v, -\infty)(v,โˆ’โˆž) to (v,+โˆž)(v, +\infty)(v,+โˆž)
  • ๊ทธ๋Ÿฌ๋‚˜ ์ถ”๊ฐ€์ ์ธ I/O ์—ฐ์‚ฐ์ด ์‹ค์ œ ๋ ˆ์ฝ”๋“œ๋ฅผ fetchํ•˜๋Š” ๋ฐ ํ•„์š”
    • ์ธ๋ฑ์Šค๊ฐ€ clustering์ธ ๊ฒฝ์šฐ, ๋ชจ๋“  access๋Š” sequential
    • ์ธ๋ฑ์Šค๊ฐ€ non-clustering์ธ ๊ฒฝ์šฐ, ๊ฐ ๋ ˆ์ฝ”๋“œ access๋Š” I/O ์—ฐ์‚ฐ์„ ํ•„์š”๋กœ ํ•  ์ˆ˜ ์žˆ์Œ.

Updates on B +^++-Trees: Insertion

  • ๋ ˆ์ฝ”๋“œ๊ฐ€ ๋ฐ์ดํ„ฐ ํŒŒ์ผ์— ์ด๋ฏธ ์ถ”๊ฐ€๋˜์—ˆ๋‹ค๊ณ  ๊ฐ€์ •
    • PrP_rPrโ€‹ ๋ฐ vvv๋Š” ๊ฐ๊ฐ record์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ ๋ฐ search key๊ฐ’
  • Search-key๊ฐ’์ด ๋‚˜ํƒ€๋‚  leaf node LLL์„ ํƒ์ƒ‰
    1. LLL์— room์ด ์žˆ๋Š” ๊ฒฝ์šฐ, (v,ย Pr)(v,~P_r)(v,ย Prโ€‹) ์Œ์„ LLL์— ์‚ฝ์ž…
    2. ๊ทธ๋ ‡์ง€ ์•Š์€ ๊ฒฝ์šฐ, node๋ฅผ split ์—”ํŠธ๋ฆฌ๋ฅผ ํฌํ•จํ•˜์—ฌ
  • Splitting a leaf node LLL
    • ์ •๋ ฌ๋œ ์ˆœ์„œ๋กœ nnn๊ฐœ์˜(search-key, pointer) ์Œ์„ ์ทจํ•จ. ์ฒซ ๋ฒˆ์งธ โŒˆn/2โŒ‰\lceil n/2 \rceilโŒˆn/2โŒ‰๋ฅผ ๊ธฐ์กด์˜ node (L)(L)(L)์— ๋ฐฐ์น˜ํ•˜๊ณ , ๋‚˜๋จธ์ง€๋ฅผ ์ƒˆ node (Lโ€ฒ)(L')(Lโ€ฒ)์— ๋ฐฐ์น˜
    • kkk๋ฅผ Lโ€ฒL'Lโ€ฒ์˜ least key๊ฐ’์ด๋ผ๊ณ  ํ•จ. (k,Lโ€ฒ)(k, L')(k,Lโ€ฒ)๋ฅผ split๋˜๋Š” node์˜ parent์— ์‚ฝ์ž…
    • Parent๊ฐ€ ๊ฐ€๋“ ์ฐผ๋‹ค๋ฉด, parent๋ฅผ ๋ถ„ํ• ํ•˜๊ณ  ๋ถ„ํ• ์„ ๋” ์œ„๋กœ propagate
  • Node ๋ถ„ํ• ์€ full์ด ์•„๋‹Œ node๋ฅผ ์ฐพ์„ ๋•Œ๊นŒ์ง€ ์œ„๋กœ ์ง„ํ–‰
    • ์ตœ์•…์˜ ๊ฒฝ์šฐ root node๊ฐ€ split๋˜์–ด ํŠธ๋ฆฌ์˜ ๋†’์ด๊ฐ€ 1 ์ฆ๊ฐ€ํ•  ์ˆ˜ ์žˆ์Œ.
  • Splitting a non-leaf node: ์ด๋ฏธ full์ธ internal node NNN์— (k,Lโ€ฒ)(k, L')(k,Lโ€ฒ)๋ฅผ ์‚ฝ์ž…ํ•  ๋•Œ
    • NNN์„ n+1n+1n+1 pointers์™€ nnn keys๋ฅผ ์œ„ํ•œ ๊ณต๊ฐ„์ด ์žˆ๋Š” in-memory area MMM์œผ๋กœ Copy
    • (k,ย Lโ€ฒ)(k,~L')(k,ย Lโ€ฒ)๋ฅผ MMM์— ์‚ฝ์ž…
    • MMM์—์„œ P1,K1,โ€ฆ,KโŒˆ(n+1)/2โŒ‰โˆ’1,PโŒˆ(n+1)/2โŒ‰P_1, K_1, \dots, K_{\lceil(n+1)/2 \rceil-1}, P_{\lceil(n+1)/2 \rceil}P1โ€‹,K1โ€‹,โ€ฆ,KโŒˆ(n+1)/2โŒ‰โˆ’1โ€‹,PโŒˆ(n+1)/2โŒ‰โ€‹๋ฅผ ๋‹ค์‹œ node NNN์œผ๋กœ Copy
    • MMM์—์„œ PโŒˆ(n+1)/2โŒ‰+1,KโŒˆ(n+1)/2โŒ‰+1,โ€ฆ,Kn,Pn+1P_{\lceil(n+1)/2 \rceil+1}, K_{\lceil(n+1)/2 \rceil+1}, \dots, K_n, P_{n+1}PโŒˆ(n+1)/2โŒ‰+1โ€‹,KโŒˆ(n+1)/2โŒ‰+1โ€‹,โ€ฆ,Knโ€‹,Pn+1โ€‹๋ฅผ ์ƒˆ๋กœ ํ• ๋‹น๋œ node Nโ€ฒN'Nโ€ฒ์œผ๋กœ Copy
    • (KโŒˆ(n+1)/2โŒ‰,Nโ€ฒ)(K_{\lceil(n+1)/2 \rceil}, N')(KโŒˆ(n+1)/2โŒ‰โ€‹,Nโ€ฒ)๋ฅผ NNN์˜ parent์— ์‚ฝ์ž…
    • ์ฐธ๊ณ : leaf node๋ฅผ ๋ถ„ํ• ํ•˜๋Š” ๊ฒƒ๊ณผ๋Š” ๋‹ฌ๋ฆฌ, search-key๋Š” 'copied'๋˜์ง€ ์•Š๊ณ  parent node๋กœ 'moved'๋จ(์ฆ‰, no duplication!)

Updates on B +^++-Trees: Deletion

  • ๋ ˆ์ฝ”๋“œ๊ฐ€ file์—์„œ ์ด๋ฏธ ์‚ญ์ œ๋˜์—ˆ๋‹ค๊ณ  ๊ฐ€์ •. vvv๋Š” record์˜ search key๊ฐ’์ด๊ณ , PrP_rPrโ€‹์€ record์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ
  • (Pr,v)(P_r, v)(Prโ€‹,v)๋ฅผ leaf node์—์„œ ์ œ๊ฑฐ
  • Leaf node๊ฐ€ ์ œ๊ฑฐ๋กœ ์ธํ•ด too few entries(<โŒˆ(nโˆ’1)/2โŒ‰< \lceil(n-1)/2 \rceil<โŒˆ(nโˆ’1)/2โŒ‰)๋ฅผ ๊ฐ€์ง€๊ณ , node์˜ ์—”ํŠธ๋ฆฌ์™€ sibling์˜ ์—”ํŠธ๋ฆฌ์— ๋‹จ์ผ node(โ‰คnโˆ’1\le n-1โ‰คnโˆ’1)์— fitํ•œ๋‹ค๋ฉด siblings๋ฅผ ๋ณ‘ํ•ฉ
    • ๋‘ node์˜ ๋ชจ๋“  ์—”ํŠธ๋ฆฌ๋ฅผ left node์— ์‚ฝ์ž…ํ•˜๊ณ  ๋‹ค๋ฅธ node๋ฅผ ์‚ญ์ œ
    • ์‚ญ์ œ๋œ node๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” ํฌ์ธํ„ฐ๊ฐ€ PiP_iPiโ€‹์ธ ๊ฒฝ์šฐ, ์Œ (Kiโˆ’1,Pi)(K_{i-1}, P_i)(Kiโˆ’1โ€‹,Piโ€‹)๋ฅผ ๊ทธ parent๋กœ๋ถ€ํ„ฐ ์žฌ๊ท€์ ์œผ๋กœ ์‚ญ์ œ
  • ๊ทธ๋ ‡์ง€ ์•Š๋‹ค๋ฉด, node๊ฐ€ ์ œ๊ฑฐ๋กœ ์ธํ•ด ๋„ˆ๋ฌด ์ ์€ ์—”ํŠธ๋ฆฌ๋ฅผ ๊ฐ€์ง€๊ณ , node์˜ ์—”ํŠธ๋ฆฌ์™€ sibling์˜ ์—”ํŠธ๋ฆฌ์— ๋‹จ์ผ node์— fitํ•˜์ง€ ์•Š๋Š” ๊ฒฝ์šฐ, ํฌ์ธํ„ฐ๋ฅผ ์žฌ๋ถ„๋ฐฐ
    • Node์™€ sibling ์‚ฌ์ด์— ํฌ์ธํ„ฐ๋ฅผ ์žฌ๋ถ„๋ฐฐํ•˜์—ฌ ๋‘˜ ๋‹ค minimum number of entries(โ‰ฅโŒˆ(nโˆ’1)/2โŒ‰\ge \lceil(n-1)/2 \rceilโ‰ฅโŒˆ(nโˆ’1)/2โŒ‰)๋ณด๋‹ค ๋” ๊ฐ€์ง€๋„๋ก ํ•จ.
    • Node์˜ parent์—์„œ ์ผ์น˜ํ•˜๋Š” search-key๊ฐ’์„ ์—…๋ฐ์ดํŠธ
  • Node ์‚ญ์ œ๋Š” ์‚ญ์ œ ํ›„ โŒˆn/2โŒ‰\lceil n/2 \rceilโŒˆn/2โŒ‰๊ฐœ ์ด์ƒ์˜ ํฌ์ธํ„ฐ๋ฅผ ๊ฐ€์ง„ node๋ฅผ ์ฐพ์„ ๋•Œ๊นŒ์ง€ cascade upwards๋  ์ˆ˜ ์žˆ์Œ.
  • Root node๊ฐ€ ์‚ญ์ œ ํ›„ ๋‹จ ํ•˜๋‚˜์˜ ํฌ์ธํ„ฐ๋งŒ ๊ฐ€์ง€๋Š” ๊ฒฝ์šฐ, ์‚ญ์ œ๋˜๊ณ  ์œ ์ผํ•œ child๊ฐ€ root๊ฐ€ ๋จ.

Complexity of B +^++-Tree Updates

  • ๋‹จ์ผ ์—”ํŠธ๋ฆฌ์˜ ์‚ฝ์ž…/์‚ญ์ œ ๋น„์šฉ์€ ํŠธ๋ฆฌ์˜ ๋†’์ด์— ๋น„๋ก€
  • KKK๊ฐœ์˜ ์—”ํŠธ๋ฆฌ์™€ ์ตœ๋Œ€ fanout nnn์ด ์žˆ๋Š” ๊ฒฝ์šฐ, ์—”ํŠธ๋ฆฌ์˜ ์‚ฝ์ž…/์‚ญ์ œ์˜ worst-case ๋ณต์žก๋„๋Š” O(logโกโŒˆn/2โŒ‰(K))O(\log_{\lceil n/2 \rceil}(K))O(logโŒˆn/2โŒ‰โ€‹(K))
  • ์‹ค์ œ๋กœ I/O ์ž‘์—…์˜ ์ˆ˜๋Š” ๋” ์ ์Œ.
    • Internal nodes๋Š” buffer์— ์žˆ๋Š” ๊ฒฝํ–ฅ์ด ์žˆ์Œ.
    • Splits/merges๋Š” ๋“œ๋ฌผ๊ณ , ๋Œ€๋ถ€๋ถ„์˜ ์‚ฝ์ž…/์‚ญ์ œ ์ž‘์—…์€ leaf node์—๋งŒ ์˜ํ–ฅ์„ ์คŒ.
  • ํ‰๊ท ์ ์ธ node occupancy๋Š” ์‚ฝ์ž… ์ˆœ์„œ์— ์˜์กด
    • Random insertion ์‹œ โ‰ฅ2/3\ge 2/3โ‰ฅ2/3
    • Sorted ์ˆœ์„œ๋กœ ์‚ฝ์ž… ์‹œ 1/21/21/2

B +^++-Tree File Organization

  • B +^++-Tree 'File' Organization
    • B +^++-tree file organization์˜ Leaf nodes๋Š” ํฌ์ธํ„ฐ ๋Œ€์‹  ๋ ˆ์ฝ”๋“œ๋ฅผ ์ €์žฅ
    • Insertion/deletion/updates๊ฐ€ ์žˆ์„ ๋•Œ๋„ ๋ฐ์ดํ„ฐ ๋ ˆ์ฝ”๋“œ๋ฅผ ํด๋Ÿฌ์Šคํ„ฐ๋œ ์ƒํƒœ๋กœ ์œ ์ง€ํ•˜๋Š” ๋ฐ ๋„์›€ โ†’ ๋ฐ์ดํ„ฐ ํŒŒ์ผ degradation ๋ฌธ์ œ(overflow blocks) ํ•ด๊ฒฐ
    • Leaf nodes๋Š” ์—ฌ์ „ํžˆ half full ์ƒํƒœ๊ฐ€ ์š”๊ตฌ๋จ.
    • ๋ ˆ์ฝ”๋“œ๊ฐ€ ํฌ์ธํ„ฐ๋ณด๋‹ค ํฌ๊ธฐ ๋•Œ๋ฌธ์—, leaf node์— ์ €์žฅํ•  ์ˆ˜ ์žˆ๋Š” ๋ ˆ์ฝ”๋“œ์˜ ์ตœ๋Œ€ ์ˆ˜๋Š” non-leaf node์˜ ํฌ์ธํ„ฐ ์ˆ˜๋ณด๋‹ค ์ ์Œ.
    • ์ข‹์€ ๊ณต๊ฐ„ utilization์ด ์ค‘์š”
      • ํ•ฉ๋ณ‘/๋ถ„ํ•  ๋™์•ˆ redistribution์— ์ถ”๊ฐ€์ ์ธ sibling nodes๋ฅผ ํฌํ•จ์‹œํ‚ด
      • Redistribution์— 2 siblings๋ฅผ ํฌํ•จ์‹œํ‚ค๋ฉด, ๊ฐ node๋Š” ์ตœ์†Œ โŒŠ3n/4โŒ‹\lfloor 3n/4 \rfloorโŒŠ3n/4โŒ‹์˜ ์—”ํŠธ๋ฆฌ๋ฅผ ๊ฐ€์ง€๊ฒŒ ๋จ.
    • ์‚ฝ์ž…/์‚ญ์ œ๋Š” B +^++-tree index์—์„œ ์—”ํŠธ๋ฆฌ์˜ ์‚ฝ์ž…/์‚ญ์ œ์™€ ๋™์ผํ•œ ๋ฐฉ์‹์œผ๋กœ ์ฒ˜๋ฆฌ

B-Tree Index Files

  • B +^++-tree์™€ ์œ ์‚ฌํ•˜์ง€๋งŒ, B-tree๋Š” search-key๊ฐ’์ด ๋‹จ ํ•œ ๋ฒˆ๋งŒ ๋‚˜ํƒ€๋‚˜๋„๋ก ํ—ˆ์šฉ.
    • Search keys์˜ redundant storage๋ฅผ ์ œ๊ฑฐ
  • Non-leaf nodes์˜ search keys๋Š” B-tree์˜ ์–ด๋””์—๋„ ๋‚˜ํƒ€๋‚˜์ง€ ์•Š์Œ.
    • Non-leaf node์˜ ๊ฐ search key์— ๋Œ€ํ•ด ์ถ”๊ฐ€์ ์ธ ํฌ์ธํ„ฐ field๊ฐ€ ํฌํ•จ๋˜์–ด์•ผ ํ•จ.
  • Generalized B-tree leaf node
  • Non-leaf node: ํฌ์ธํ„ฐ BiB_iBiโ€‹๋Š” bucket ๋˜๋Š” ํŒŒ์ผ ๋ ˆ์ฝ”๋“œ ํฌ์ธํ„ฐ
  • B-Tree ์ธ๋ฑ์Šค์˜ ์ด์ 
    • ์ผ์น˜ํ•˜๋Š” B +^++-Tree๋ณด๋‹ค ๋” ์ ์€ tree nodes๊ฐ€ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์Œ.
    • ๋•Œ๋•Œ๋กœ leaf node์— ๋„๋‹ฌํ•˜๊ธฐ ์ „์— search-key๊ฐ’์„ ๋ฐœ๊ฒฌ.
  • B-Tree indices์˜ ๋‹จ์ 
    • Small fraction์˜ search-key๊ฐ’๋งŒ ์ผ์ฐ ๋ฐœ๊ฒฌ๋จ.
    • Non-leaf nodes๋Š” ์ฆ๊ฐ€, fan-out์ด ๊ฐ์†Œ โ†’ B-Trees๋Š” ์ผ๋ฐ˜์ ์œผ๋กœ ์ผ์น˜ํ•˜๋Š” B +^++-Tree๋ณด๋‹ค ํฐ ๊นŠ์ด๋ฅผ ๊ฐ€์ง.
    • ์‚ฝ์ž…/์‚ญ์ œ๊ฐ€ B +^++-Trees๋ณด๋‹ค ๋” ๋ณต์žก
    • ๊ตฌํ˜„์ด B +^++-Trees๋ณด๋‹ค ์–ด๋ ค์›€.
    • ์ผ๋ฐ˜์ ์œผ๋กœ, B-Trees์˜ ์žฅ์ ์ด ๋‹จ์ ์„ ๋›ฐ์–ด๋„˜์ง€ ๋ชปํ•จ.

Other Issues in Indexing

  • Record relocation and 2์ฐจ ์ธ๋ฑ์Šค
    • ๋ ˆ์ฝ”๋“œ๊ฐ€ ์ด๋™ํ•˜๋ฉด, ๋ ˆ์ฝ”๋“œ ํฌ์ธํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๋ชจ๋“  2์ฐจ ์ธ๋ฑ์Šค๊ฐ€ ์—…๋ฐ์ดํŠธ๋˜์–ด์•ผ ํ•จ.
    • โ†’ B +^++-tree file organizations์—์„œ leaf node splits๊ฐ€ ๋งค์šฐ expensiveํ•จ.
    • ํ•ด๊ฒฐ์ฑ…: 2์ฐจ ์ธ๋ฑ์Šค์—์„œ ๋ ˆ์ฝ”๋“œ ํฌ์ธํ„ฐ ๋Œ€์‹  B +^++-tree file organization์˜ search key๋ฅผ ์‚ฌ์šฉ
      • ๋ ˆ์ฝ”๋“œ๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•ด file organization์˜ ์ถ”๊ฐ€์ ์ธ ํƒ์ƒ‰
      • Queries์— ๋Œ€ํ•œ ๋†’์€ ๋น„์šฉ, ๊ทธ๋Ÿฌ๋‚˜ node ๋ถ„ํ• ์˜ ๋น„์šฉ์ด ์ €๋ ด
  • Indexing strings
    • Variable length strings as keys
      • Variable fanout
      • Pointers์˜ ์ˆ˜๊ฐ€ ์•„๋‹Œ space utilization์„ splitting์˜ ๊ธฐ์ค€์œผ๋กœ ์‚ฌ์šฉ
      • Prefix ์••์ถ•
        • Internal nodes์˜ key๊ฐ’์€ prefixes of full key์ผ ์ˆ˜ ์žˆ์Œ โ†’ fanout ์ฆ๊ฐ€
        • Key๊ฐ’์œผ๋กœ ๋ถ„๋ฆฌ๋œ subtrees์˜ ์—”ํŠธ๋ฆฌ๋ฅผ ๊ตฌ๋ถ„ํ•˜๊ธฐ์— ์ถฉ๋ถ„ํ•œ ๋ฌธ์ž๋ฅผ ์œ ์ง€
          • ์˜ˆ: "Silas"์™€ "Silberschatz"๋Š” "Silb"๋กœ ๋ถ„๋ฆฌ๋  ์ˆ˜ ์žˆ์Œ.

Other Issues in Indexing: Bulk Loading & Bottom-Up Build

  • ์—”ํŠธ๋ฆฌ๋ฅผ ํ•œ ๋ฒˆ์— B +^++-tree์— ์‚ฝ์ž…ํ•˜๋ฉด ์—”ํŠธ๋ฆฌ ๋‹น โ‰ฅ1\ge 1โ‰ฅ1์˜ I/O๊ฐ€ ํ•„์š”ํ•  ์ˆ˜ ์žˆ์Œ.
    • ์ตœ์•…์˜ ๊ฒฝ์šฐ, ํŠธ๋ฆฌ์˜ ๋†’์ด์— ๋น„๋ก€
    • ๋งŽ์€ ์ˆ˜์˜ ์—”ํŠธ๋ฆฌ๋ฅผ ํ•œ ๋ฒˆ์— ์‚ฝ์ž…ํ•˜๋Š” ๊ฒฝ์šฐ ๋งค์šฐ ๋น„ํšจ์œจ์ 
      • B +^++-tree ์ธ๋ฑ์Šค๊ฐ€ ํฐ relation์— ๊ตฌ์ถ•๋  ๋•Œ bulk loading์ด ํ•„์š”
  • ํšจ๊ณผ์ ์ธ ๋Œ€์•ˆ 1
    • ํšจ์œจ์ ์ธ external sorting ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ์‚ฌ์šฉํ•˜์—ฌ ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ์ •๋ ฌ
    • ์ •๋ ฌ๋œ ์ˆœ์„œ๋กœ ์‚ฝ์ž…
      • ํŠน์ • leaf node๋กœ ๊ฐ€๋Š” ๋ชจ๋“  ์—”ํŠธ๋ฆฌ๋Š” ์—ฐ์†์ ์œผ๋กœ ๋‚˜ํƒ€๋‚จ โ†’ leaf node๋Š” 1๋ฒˆ๋งŒ ์ถœ๋ ฅ๋˜๋ฉด ๋จ.
      • ๋” ํ–ฅ์ƒ๋œ I/O ์„ฑ๋Šฅ, ๊ทธ๋Ÿฌ๋‚˜ ๋Œ€๋ถ€๋ถ„์˜ leaf nodes๊ฐ€ half full
  • ํšจ๊ณผ์ ์ธ ๋Œ€์•ˆ 2: Bottom-up B +^++-tree ๊ตฌ์„ฑ
    • ์ด์ „์ฒ˜๋Ÿผ ์—”ํŠธ๋ฆฌ๋ฅผ ์ •๋ ฌ
    • ๊ทธ๋ฆฌ๊ณ  leaf level๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜์—ฌ layer-by-layer tree๋ฅผ ์ƒ์„ฑ
    • ์ •๋ ฌ๋œ ์—”ํŠธ๋ฆฌ๋ฅผ block์— fitํ•  ์ˆ˜ ์žˆ๋Š” ๋งŒํผ ๋งŽ์€ ์—”ํŠธ๋ฆฌ๋ฅผ ์œ ์ง€ํ•˜๋ฉด์„œ blocks๋กœ Break up โ†’ ๊ฒฐ๊ณผ block์ด leaf level์„ ํ˜•์„ฑ
    • ๊ฐ block์˜ ์ตœ์†Ÿ๊ฐ’๊ณผ block์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋Š” ๋‹ค์Œ level์˜ ์—”ํŠธ๋ฆฌ๋ฅผ ์ƒ์„ฑํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ
    • ๋Œ€๋ถ€๋ถ„์˜ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์‹œ์Šคํ…œ์—์„œ bulk-load utility์˜ ์ผ๋ถ€๋กœ ๊ตฌํ˜„

Hash Indices

Static Hashing

  • Bucket: โ‰ฅ1\ge 1โ‰ฅ1 ์ธ๋ฑ์Šค ์—”ํŠธ๋ฆฌ๋ฅผ ํฌํ•จํ•˜๋Š” ์ €์žฅ ๋‹จ์œ„
  • ์—”ํŠธ๋ฆฌ์˜ search-key๊ฐ’์—์„œ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์—”ํŠธ๋ฆฌ์˜ bucket์„ ํš๋“
  • ํ•ด์‹œ ํ•จ์ˆ˜ hhh: ๋ชจ๋“  search-key๊ฐ’ ์ง‘ํ•ฉ KKK์—์„œ ๋ชจ๋“  bucket ์ฃผ์†Œ๋“ค์˜ ์ง‘ํ•ฉ BBB๋กœ์˜ ํ•จ์ˆ˜
  • ํ•ด์‹œ ํ•จ์ˆ˜๋Š” access, insertion, deletion์„ ์œ„ํ•œ ์—”ํŠธ๋ฆฌ๋ฅผ locateํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ
  • ๋‹ค๋ฅธ search-key๊ฐ’์„ ๊ฐ€์ง„ ์—”ํŠธ๋ฆฌ๊ฐ€ ๊ฐ™์€ bucket์— ๋งคํ•‘๋  ์ˆ˜ ์žˆ์Œ. ๋”ฐ๋ผ์„œ entry๋ฅผ ์ฐพ๊ธฐ ์œ„ํ•ด ์ „์ฒด bucket์„ ์ˆœ์ฐจ์ ์œผ๋กœ ํƒ์ƒ‰ํ•ด์•ผ ํ•จ.

Static Hashing

  • ํ•ด์‹œ ์ธ๋ฑ์Šค์—์„œ, buckets์€ ๋ ˆ์ฝ”๋“œ์— ๋Œ€ํ•œ ํฌ์ธํ„ฐ๋ฅผ ๊ฐ€์ง„ ์—”ํŠธ๋ฆฌ๋ฅผ ์ €์žฅ(์ฆ‰, buckets์ด index entries๋ฅผ ์ €์žฅ)
  • In a hash file-organization, buckets์€ ๋ ˆ์ฝ”๋“œ๋ฅผ ์ €์žฅ
  • Bucket overflow๋Š” ๋‹ค์Œ์œผ๋กœ ์ธํ•ด ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ์Œ.
    • ๋ถˆ์ถฉ๋ถ„ํ•œ buckets
    • ๋ ˆ์ฝ”๋“œ ๋ถ„๋ฐฐ์—์„œ์˜ Skew. ๋‘ ๊ฐ€์ง€ ์ด์œ ๋กœ ๋ฐœ์ƒ ๊ฐ€๋Šฅ
      • ๋‹ค์ค‘ ๋ ˆ์ฝ”๋“œ๊ฐ€ ๊ฐ™์€ search-key๊ฐ’์„ ๊ฐ€์ง.
      • ์„ ํƒ๋œ hash ํ•จ์ˆ˜๊ฐ€ key๊ฐ’์˜ non-uniform ๋ถ„ํฌ๋ฅผ ์ƒ์„ฑ
  • Bucket overflow์˜ ํ™•๋ฅ ์€ ์ค„์ผ ์ˆ˜ ์žˆ์ง€๋งŒ, ์™„์ „ํžˆ ์ œ๊ฑฐํ•  ์ˆ˜ ์—†์Œ.
    • Overflow buckets์„ ์‚ฌ์šฉํ•˜์—ฌ ์ฒ˜๋ฆฌ

Hash Functions

  • ํ•ด์‹œ ํ•จ์ˆ˜๋Š” uniform + random์ด ์š”๊ตฌ๋จ.
  • Uniform(๊ท ์ผ): ๊ฐ bucket์— ๋ชจ๋“  ์ด๋ก ์ ์œผ๋กœ ๊ฐ€๋Šฅํ•œ ๊ฐ’๋“ค์˜ ์ง‘ํ•ฉ์—์„œ ๊ฐ™์€ ์ˆ˜์˜ search-key๊ฐ’์ด ํ• ๋‹น๋จ.
  • Random(๋ฌด์ž‘์œ„): ๊ฐ bucket์€ ํŒŒ์ผ ๋‚ด search-key๊ฐ’์˜ ์‹ค์ œ ๋ถ„ํฌ์— irrespective, ๊ฐ™์€ ์ˆ˜์˜ ์—”ํŠธ๋ฆฌ์— ํ• ๋‹น๋จ.
  • ์ „ํ˜•์ ์ธ ํ•ด์‹œ ํ•จ์ˆ˜๋Š” search-key์˜ internal binary ํ‘œํ˜„์— ๋Œ€ํ•œ ์—ฐ์‚ฐ์„ ์ˆ˜ํ–‰
    • ์˜ˆ: ๋ฌธ์ž์—ด search-key์˜ ๊ฒฝ์šฐ, ๋ฌธ์ž์—ด์˜ ๋ชจ๋“  ๋ฌธ์ž๋“ค์˜ ์ด์ง„ ํ‘œํ˜„์„ ์ถ”๊ฐ€ํ•˜๊ณ , sum modulo the number of buckets๋ฅผ ๋ฐ˜ํ™˜

Handling of Bucket Overflows

  • Overflow chaining: ์ฃผ์–ด์ง„ bucket์˜ overflow buckets์€ linked list๋กœ chained together
  • Overflow chaining์„ ์‚ฌ์šฉํ•˜๋Š” ํ•ด์‹œ ์ธ๋ฑ์‹ฑ: closed addressing (๋˜๋Š” closed hasing)
    • Overflow buckets์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๋Š” ๋Œ€์•ˆ์ ์ธ open addressing์€ ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค ์• ํ”Œ๋ฆฌ์ผ€์ด์…˜์— ์ ํ•ฉํ•˜์ง€ ์•Š์Œ.

Deficiencies of Static Hashing

  • Static hashing์—์„œ, ํ•จ์ˆ˜ hhh๋Š” search-key๊ฐ’์„ fixed set(BBB)์˜ bucket ์ฃผ์†Œ๋กœ ๋งคํ•‘. ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค๋Š” ์‹œ๊ฐ„์ด ์ง€๋‚จ์— ๋”ฐ๋ผ grow or shrink
    • Initial number of buckets๊ฐ€ ๋„ˆ๋ฌด ์ ๊ณ , ํŒŒ์ผ์ด ์ปค์ง์— ๋”ฐ๋ผ ๋„ˆ๋ฌด ๋งŽ์€ overflows๋กœ ์ธํ•ด ์„ฑ๋Šฅ ๊ฐ์†Œ
    • ์˜ˆ์ธก๋œ ์ฆ๊ฐ€๋ถ„์„ ์œ„ํ•ด ๊ณต๊ฐ„์ด ํ• ๋‹น๋˜๋ฉด, ์ดˆ๊ธฐ์— ์ƒ๋‹นํ•œ ์–‘์˜ ๊ณต๊ฐ„์ด ๋‚ญ๋น„๋จ.)
    • Database๊ฐ€ ์ถ•์†Œ๋˜๋ฉด, ๋‹ค์‹œ ๊ณต๊ฐ„์ด ๋‚ญ๋น„๋จ.
  • One solution
    • ์ƒˆ๋กœ์šด ํ•ด์‹œ ํ•จ์ˆ˜๋กœ ํŒŒ์ผ์„ ์ฃผ๊ธฐ์ ์œผ๋กœ reorganize
    • Expensive, normal ์—ฐ์‚ฐ์„ ํ˜ผ๋ž€์‹œํ‚ด.
  • ๋” ๋‚˜์€ ํ•ด๊ฒฐ์ฑ…: Dynamic Hashing
    • Buckets์˜ number๋ฅผ ๋™์ ์œผ๋กœ ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ๋Š” ๊ธฐ๋ฒ•
    • ์‚ฌ์ด์ฆˆ๊ฐ€ ์„ฑ์žฅํ•˜๊ณ  ์ถ•์†Œ๋˜๋Š” ๋ฐ์ดํ„ฐ๋ฒ ์ด์Šค์— ์ ํ•ฉ
    • ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ๋™์ ์œผ๋กœ ์ˆ˜์ •ํ•  ์ˆ˜ ์žˆ๋„๋ก ํ—ˆ์šฉ
    • Extendable hashing: dynamic hashing์˜ ํ•œ ํ˜•ํƒœ

Comparison of Ordered Indexing and Hashing

  • ์ฃผ๊ธฐ์  re-organization์˜ ๋น„์šฉ
  • ์‚ฝ์ž…/์‚ญ์ œ์˜ ์ƒ๋Œ€์  ๋นˆ๋„
  • Is it desirable to optimize average access time at the expense of worst-case access time?
  • Expected type of queries
    • Hashing์€ ์ผ๋ฐ˜์ ์œผ๋กœ key์˜ ๊ตฌ์ฒด์ ์ธ ๊ฐ’์„ ๊ฐ€์ง„ ๋ ˆ์ฝ”๋“œ๋ฅผ retrievingํ•˜๋Š” ๋ฐ ์ ํ•ฉ
    • Range queries๊ฐ€ common, ์ •๋ ฌ๋œ ์ธ๋ฑ์Šค๊ฐ€ ์„ ํ˜ธ๋จ.
  • ์‹ค์ œ๋กœ๋Š”
    • PostgreSQL์€ hash ์ธ๋ฑ์Šค๋ฅผ ์ง€์›ํ•˜์ง€๋งŒ, ๋‚ฎ์€ ์„ฑ๋Šฅ์œผ๋กœ ์ธํ•ด ์ž˜ ์‚ฌ์šฉ๋˜์ง€ ์•Š์Œ.
    • Oracle์€ static hash organization์„ ์ง€์›ํ•˜์ง€๋งŒ, not hash indices
    • SQLServer๋Š” B +^++-trees๋งŒ์„ ์ง€์›
์ตœ๊ทผ ์ˆ˜์ •: 26. 1. 3. ์˜คํ›„ 9:57
Contributors: kmbzn
Prev
13. Data Storage Structures
Next
15. Query Processing

BUILT WITH

CloudflareNode.jsGitHubGitVue.jsJavaScriptVSCodenpm

All trademarks and logos are property of their respective owners.
ยฉ 2026 kmbzn ยท MIT License