Translation

The oldest posts, are written in Italian. If you are interested and you want read the post in English, please use Google Translator. You can find it on the right side. If the translation is wrong, please email me: I'll try to translate for you.

mercoledì, giugno 26, 2024

MATCH_RECOGNIZE: an overview

MATCH_RECOGNIZE is an analytical function that allows you to identify and extract specific patterns within data sequences. It works by analyzing the ordered rows of a table and looking for matches with a user-defined pattern.

 MATCH_RECOGNIZE has 5 sections

  • PARTITION BY: Divide data into separate sequences for analysis
  • ORDER BY: Specifies the order in which lines are processed within each partition
  • MEASURES: Specifies calculations or aggregations to be performed on lines that match the pattern
  • PATTERN: Defines the pattern you want to match using regular expressions
  • DEFINE: Assign meaning to variables used in the pattern

 This is a typical statement using MATCH_RECOGNIZE


SELECT *
FROM MOVEMENTS
MATCH_RECOGNIZE (
    PARTITION BY prog_awp_id
    ORDER     BY move_date

    MEASURES
          CLASSIFIER() AS cls_same_ub

    ALL ROWS PER MATCH
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (init same_ub*)
    DEFINE
        same_ub AS (location_id = PREV(location_id))

)

In this post, I want to describe the order of elaboration of these steps.

First of all, you can summarize the order of application following this schema (in my example the WHERE condition is not present)

 

 In more details, you can visualize it like this (in the next picture I don't show the columns and table name: just the reserved words)

  

 In the order:

  1. The data must be partitioned (PARTITION BY), if they are logically grouped, and ordered (ORDER BY) according to a predetermined criterion (usually temporal)
  2. We must look for the correct sequence (PATTERN) based on the definition (DEFINE) of the "pattern" that we are looking for
  3. Determine the desired output type (ALL ROWS or ONE ROW). Here I'm using ALL ROWS, but it is possible to use ONE ROW that show only one row satisfying the "match" PATTERN
  4. Any information to show (MEASURES)
  5. What to do after a sequence has been detected (AFTER MATCH)

Reference:

  • https://www.slideshare.net/slideshow/row-pattern-matching-in-oracle-database-12c/66577835
  • https://oracle-base.com/articles/12c/pattern-matching-in-oracle-database-12cr1
  • https://www.oracle.com/technetwork/database/bi-datawarehousing/mr-deep-dive-3769287.pdf
  • https://docs.oracle.com/database/121/DWHSG/pattern.htm#DWHSG8956
  • https://www.oracle.com/webfolder/technetwork/tutorials/obe/db/12c/r1/misc/sql_pattern/sql_pattern.html
  • https://books.apple.com/us/book/complete-guide-to-sql-pattern-matching-volume-1/id1302139558

Nessun commento: