Más sobre FROM en Teradata

Ya vimos aquí algunas de las particularidades que presenta la sintaxis Teradata (no-ANSI) con los ‘SELECTs’ y los ‘FROM’. Pero la cosa no acaba ahí: ‘UPDATE’ y ‘DELETE’ también permiten variaciones sobre el mismo tema.

Veamos primero ‘UPDATE’.

Teradata documenta el uso de ‘JOINS’ implícitas en ‘UPDATEs’:

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES
          1  UNO

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA03;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          1  RAS

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
UPDATE MY_DB.PRUEBA01
   SET C_TXT=MY_DB.PRUEBA03.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Update completed. 2 rows changed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          3  TRES
          1  RAS

Y también ‘JOINS’ explícitas (en otros ‘RDBMS’ a esto se le llama ‘UPDATE FROM’):

SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES
          1  UNO

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA03;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          1  RAS

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
UPDATE MY_DB.PRUEBA01
  FROM MY_DB.PRUEBA03 b
   SET C_TXT=b.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=b.ID_N;

 *** Update completed. 2 rows changed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          3  TRES
          1  RAS

Entonces, ¿qué pasará si la tabla del ‘FROM’ es distinta a la del ‘WHERE’?

SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES
          1  UNO

 BTEQ -- Enter your DBC/SQL request or BTEQ command:

SELECT * FROM MY_DB.PRUEBA02;

 *** Query completed. 5 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  TWO
          5  FIVE
          4  FOUR
          1  ONE
          3  THREE

 BTEQ -- Enter your DBC/SQL request or BTEQ command:

SELECT * FROM MY_DB.PRUEBA03;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          1  RAS

 BTEQ -- Enter your DBC/SQL request or BTEQ command:

UPDATE MY_DB.PRUEBA01
  FROM MY_DB.PRUEBA02
   SET C_TXT=MY_DB.PRUEBA03.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Failure 7547 Target row updated by multiple source rows.
                Statement# 1, Info =0
 *** Total elapsed time was 1 second.

Ooops! ¿Qué ha pasado? Lo vemos mejor con un EXPLAIN:

EXPLAIN
UPDATE MY_DB.PRUEBA01
  FROM MY_DB.PRUEBA02
   SET C_TXT=MY_DB.PRUEBA03.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Help information returned. 36 rows.
 *** Total elapsed time was 1 second.

Explanation
--------------------------------------------------------------------------
  1) First, we lock a distinct MY_DB."pseudo table" for read on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA03.
  2) Next, we lock a distinct MY_DB."pseudo table" for read on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA02.
  3) We lock a distinct MY_DB."pseudo table" for write on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA01.
  4) We lock MY_DB.PRUEBA03 for read, we lock MY_DB.PRUEBA02
     for read, and we lock MY_DB.PRUEBA01 for write.
  5) We do an all-AMPs JOIN step from MY_DB.PRUEBA01 by way of a
     RowHash match scan with no residual conditions, which is joined to
     MY_DB.PRUEBA03 by way of a RowHash match scan with no
     residual conditions.  MY_DB.PRUEBA01 and MY_DB.PRUEBA03
     are joined using a merge join, with a join condition of (
     "MY_DB.PRUEBA01.ID_N = MY_DB.PRUEBA03.ID_N").  The
     result goes into Spool 2 (all_amps), which is duplicated on all
     AMPs.  The size of Spool 2 is estimated with low confidence to be
     400 rows (11,200 bytes).  The estimated time for this step is 0.01
     seconds.
  6) We do an all-AMPs JOIN step from MY_DB.PRUEBA02 by way of an
     all-rows scan with no residual conditions, which is joined to
     Spool 2 (Last Use) by way of an all-rows scan.
     MY_DB.PRUEBA02 and Spool 2 are joined using a product join,
     with a join condition of ("(1=1)").  The result goes into Spool 1
     (all_amps), which is redistributed by the hash code of (
     MY_DB.PRUEBA01.ROWID) to all AMPs.  Then we do a SORT to
     order Spool 1 by the sort key in spool field1.  The size of Spool
     1 is estimated with low confidence to be 400 rows (11,200 bytes).
     The estimated time for this step is 0.01 seconds.
  7) We do a MERGE Update to MY_DB.PRUEBA01 from Spool 1 (Last
     Use) via ROWID.  The size is estimated with low confidence to be
     400 rows (10,800 bytes).  The estimated time for this step is 0.20
     seconds.
  8 ) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> No rows are returned to the user as the result of statement 1.
     The total estimated time is 0.22 seconds.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:

Como se ve, primero se hace un ‘JOIN’ entre PRUEBA01 y PRUEBA03 (las que importan) pero, al haber introducido PRUEBA02 en el ‘JOIN’, se produce un ‘CROSS JOIN’ de las dos filas del ‘JOIN’ de PRUEBA01 y PRUEBA03 con todas las filas de PRUEBA02 .Habrá exactamente 10 filas: las dos del primer ‘JOIN’ por las cinco de PRUEBA02 y serán iguales dos a dos, por eso la parte final (MERGE Update) falla con el error “Failure 7547 Target row updated by multiple source rows” (¿Dónde hemos visto esto antes?).

Por supuesto, el ‘UPDATE’ NO falla si sustituimos PRUEBA02 por PRUEBA03 (aunque sea redundante, ya que es el caso que vimos más arriba):

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
UPDATE MY_DB.PRUEBA01
  FROM MY_DB.PRUEBA03
   SET C_TXT=MY_DB.PRUEBA03.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Update completed. 2 rows changed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          3  TRES
          1  RAS

Como tampoco fallará si la tabla “intrusa” del FROM tiene sólo una fila (no habrá ‘CROSS JOINS’ y por tanto no habrá múltiples filas que intentarán actualizar una misma fila):

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA04;

 *** Query completed. One row found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          1  EIN

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
UPDATE MY_DB.PRUEBA01
  FROM MY_DB.PRUEBA04
   SET C_TXT=MY_DB.PRUEBA03.C_TXT
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Update completed. 2 rows changed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          3  TRES
          1  RAS

¿Y qué pasa con ‘DELETE’?

Pues ‘DELETE’ también permite ‘JOINS’ implícitas:

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES
          1  UNO

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA03;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DVA
          1  RAS

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
DELETE FROM MY_DB.PRUEBA01
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Delete completed. 2 rows removed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. One row found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          3  TRES

¿Y cómo funcionará aquí la “tabla intrusa” en el ‘FROM’?

SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 3 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES
          1  UNO

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
DELETE FROM MY_DB.PRUEBA01,
            MY_DB.PRUEBA02
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Delete completed. 2 rows removed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. One row found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          3  TRES

¿Cómo? ¿Y por qué no falla como el ‘UPDATE’?

The answer, my friend, is blowing in the EXPLAIN“, que diría Dylan:

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
EXPLAIN
DELETE FROM MY_DB.PRUEBA01,
            MY_DB.PRUEBA02
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA03.ID_N;

 *** Help information returned. 37 rows.
 *** Total elapsed time was 1 second.

Explanation
-----------------------------------------------------------------------
  1) First, we lock a distinct MY_DB."pseudo table" for read on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA03.
  2) Next, we lock a distinct MY_DB."pseudo table" for read on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA02.
  3) We lock a distinct MY_DB."pseudo table" for write on a
     RowHash to prevent global deadlock for MY_DB.PRUEBA01.
  4) We lock MY_DB.PRUEBA03 for read, we lock MY_DB.PRUEBA02
     for read, and we lock MY_DB.PRUEBA01 for write.
  5) We do an all-AMPs JOIN step from MY_DB.PRUEBA01 by way of a
     RowHash match scan with no residual conditions, which is joined to
     MY_DB.PRUEBA03 by way of a RowHash match scan with no
     residual conditions.  MY_DB.PRUEBA01 and MY_DB.PRUEBA03
     are joined using a merge join, with a join condition of (
     "MY_DB.PRUEBA01.ID_N = MY_DB.PRUEBA03.ID_N").  The
     result goes into Spool 2 (all_amps), which is duplicated on all
     AMPs.  The size of Spool 2 is estimated with low confidence to be
     400 rows (7,200 bytes).  The estimated time for this step is 0.01
     seconds.
  6) We do an all-AMPs JOIN step from MY_DB.PRUEBA02 by way of an
     all-rows scan with no residual conditions, which is joined to
     Spool 2 (Last Use) by way of an all-rows scan.
     MY_DB.PRUEBA02 and Spool 2 are joined using a product join,
     with a join condition of ("(1=1)").  The result goes into Spool 1
     (all_amps), which is redistributed by the hash code of (
     MY_DB.PRUEBA01.ROWID) to all AMPs.  Then we do a SORT to
     order Spool 1 by row hash and the sort key in spool field1
     eliminating duplicate rows.  The size of Spool 1 is estimated with
     low confidence to be 400 rows (7,200 bytes).  The estimated time
     for this step is 0.01 seconds.
  7) We do an all-AMPs MERGE DELETE to MY_DB.PRUEBA01 from Spool 1
     (Last Use) via the row id.  The size is estimated with low
     confidence to be 400 rows.  The estimated time for this step is
     6.80 seconds.
  8 ) We spoil the parser's dictionary cache for the table.
  9) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> No rows are returned to the user as the result of statement 1.

Ahora el paso 6) hace un ‘SORT’ con eliminación de duplicados (The result goes into Spool 1 (all_amps), which is redistributed by the hash code of (MY_DB.PRUEBA01.ROWID) to all AMPs. Then we do a SORT to order Spool 1 by row hash and the sort key in spool field1 eliminating duplicate rows).

Además, por el propio funcionamiento de ‘DELETE’ no se producirá un error si varias filas intentan borrar una misma fila:

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
CREATE MULTISET TABLE MY_DB.PRUEBA05 ,
     NO FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT
     (
      ID_N INTEGER NOT NULL,
      C_TXT VARCHAR(25) CHARACTER SET LATIN NOT CASESPECIFIC)
PRIMARY INDEX ( ID_N );

 *** Table has been created.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
INSERT INTO MY_DB.PRUEBA05 VALUES (1, 'UN');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
INSERT INTO MY_DB.PRUEBA05 VALUES (1, 'UN');

 *** Insert completed. One row added.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA05;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          1  UN
          1  UN

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
DELETE FROM MY_DB.PRUEBA01
 WHERE MY_DB.PRUEBA01.ID_N=MY_DB.PRUEBA05.ID_N;

 *** Delete completed. One row removed.
 *** Total elapsed time was 1 second.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT * FROM MY_DB.PRUEBA01;

 *** Query completed. 2 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

       ID_N  C_TXT
-----------  -------------------------
          2  DOS
          3  TRES

Las dos filas de PRUEBA05 han borrado una fila (la fila “1, ‘UNO'”) de PRUEBA01 sin errores.

Saludos.

Carlos.

Anuncios

Una respuesta a Más sobre FROM en Teradata

  1. […] un CROSS JOIN que es el causante de los resultados tan extraños (algo parecido a esto lo vimos aquí hace […]

Responder

Introduce tus datos o haz clic en un icono para iniciar sesión:

Logo de WordPress.com

Estás comentando usando tu cuenta de WordPress.com. Cerrar sesión / Cambiar )

Imagen de Twitter

Estás comentando usando tu cuenta de Twitter. Cerrar sesión / Cambiar )

Foto de Facebook

Estás comentando usando tu cuenta de Facebook. Cerrar sesión / Cambiar )

Google+ photo

Estás comentando usando tu cuenta de Google+. Cerrar sesión / Cambiar )

Conectando a %s

A %d blogueros les gusta esto: