DAY OF WEEK in Teradata

Teradata does not have a DAY_OF_WEEK() function, nor a date format that makes the calculation easy, as Oracle does, for example.

Indeed, in Oracle a simple TO_CHAR(date, 'D') does the job (bearing in mind that the day-of-week numbering depends on the NLS settings: 1 is 'Sunday' in Anglo-Saxon territories, while 1 is 'Monday' in other latitudes and in the ISO 8601 standard).

SQL*Plus: Release 10.2.0.1.0 - Production on Thu Jun 24 10:31:09 2010

Copyright (c) 1982, 2005, Oracle.  All rights reserved.

Enter user-name: carlos@XE.localhost
Enter password:

Connected to:
Oracle Database 10g Express Edition Release 10.2.0.1.0 - Production


CARLOS@XE.localhost> SELECT SYSDATE, TO_CHAR(SYSDATE,'D') FROM DUAL;

SYSDATE             T
------------------- -
2010/06/24 10:33:05 4

CARLOS@XE.localhost>
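
The numbering can be checked by switching the NLS territory in the session; a minimal sketch, not part of the original session (the territory names are just examples):

ALTER SESSION SET NLS_TERRITORY = 'SPAIN';
SELECT TO_CHAR(DATE '2010-06-24', 'D') FROM DUAL;    -- 4 (Monday = 1)

ALTER SESSION SET NLS_TERRITORY = 'AMERICA';
SELECT TO_CHAR(DATE '2010-06-24', 'D') FROM DUAL;    -- 5 (Sunday = 1)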

Teradata does support a format (two, actually) for the day of the week, but it returns the name as a literal instead of a number.

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT DATE, DATE (FORMAT 'EEE'), DATE (FORMAT 'EEEE');


 *** Query completed. One row found. 3 columns returned.
 *** Total elapsed time was 1 second.

Current Date  Current Date  Current Date
------------  ------------  ------------
  2010-06-24           Thu     Thursday

If we want a numeric format for the day of the week, we can turn to SYS_CALENDAR.CALENDAR, which has a column for exactly that: day_of_week.

SELECT DATE,
       DAY_OF_WEEK
  FROM SYS_CALENDAR.CALENDAR
 WHERE CALENDAR_DATE = DATE;


 *** Query completed. One row found. 2 columns returned.
 *** Total elapsed time was 1 second.

Current Date  day_of_week
------------  -----------
  2010-06-24            5

Here we can see that Teradata follows the Anglo-Saxon convention of 1 = 'Sunday'.
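
If the ISO numbering (1 = 'Monday') is preferred, the SYS_CALENDAR value can be shifted with a little arithmetic; a small sketch, not part of the original post:

SELECT CALENDAR_DATE,
       DAY_OF_WEEK,
       ((DAY_OF_WEEK + 5) MOD 7) + 1 AS iso_day_of_week   -- 2010-06-24: 5 -> 4
  FROM SYS_CALENDAR.CALENDAR
 WHERE CALENDAR_DATE = DATE;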

But there is another way to compute the day of the week.

We are going to take advantage of the fact that January 1, 1900 was a Monday (although we could have picked any other Monday):


 BTEQ -- Enter your DBC/SQL request or BTEQ command:
SELECT DATE '1900-01-01' (FORMAT 'EEEE');


 *** Query completed. One row found. One column returned.
 *** Total elapsed time was 1 second.

1900-01-01
----------
 Monday

Knowing this, a little arithmetic is all it takes:

 BTEQ -- Enter your DBC/SQL request or BTEQ command:
select DATE, ((DATE - date '1900-01-01') mod 7) + 1 day_of_week;


 *** Query completed. One row found. 2 columns returned.
 *** Total elapsed time was 1 second.

Current Date  day_of_week
------------  -----------
  2010-06-24            4

It may seem like something simple, but it has its uses. On a large table the difference can become significant:

SELECT COUNT(1) FROM MY_DB.PRUEBA20;


 *** Query completed. One row found. One column returned.
 *** Total elapsed time was 28 seconds.

   Count(1)
-----------
  112607372

If we count rows grouped by day of the week using a join with SYS_CALENDAR.CALENDAR, we get:

select b.day_of_week,
       count(1) total
  from my_db.prueba20 a,
       sys_calendar.calendar b
 where a.f_fecha = b.calendar_date
 group by b.day_of_week
 order by day_of_week
;


 *** Query completed. 7 rows found. 2 columns returned.
 *** Total elapsed time was 2 minutes and 15 seconds.

day_of_week        total
-----------  -----------
          1      7307389
          2     20884684
          3     20387173
          4     20550288
          5     20188659
          6     16146929
          7      7142250

But if we use the date arithmetic explained above, we get:

select ((f_fecha - date '1900-01-01') mod 7) + 1 day_of_week,
       count(1) total
  from my_db.prueba20
  group by day_of_week
  order by day_of_week
;


 *** Query completed. 7 rows found. 2 columns returned.
 *** Total elapsed time was one minute and 6 seconds.

day_of_week        total
-----------  -----------
          1     20884684
          2     20387173
          3     20550288
          4     20188659
          5     16146929
          6      7142250
          7      7307389

That is, half the time.

The ‘EXPLAIN’s explain the why very well:

explain
select b.day_of_week,
       count(1) total
  from my_db.prueba20 a,
       sys_calendar.calendar b
 where a.f_fecha = b.calendar_date
 group by b.day_of_week
 order by day_of_week
;


 *** Help information returned. 64 rows.
 *** Total elapsed time was 1 second.

Explanation
--------------------------------------------------------------------------------
  1) First, we lock a distinct dw_dgt."pseudo table" for read on a
     RowHash to prevent global deadlock for dw_dgt.a.
  2) Next, we lock a distinct SYS_CALENDAR."pseudo table" for read on a
     RowHash to prevent global deadlock for SYS_CALENDAR.CALDATES.
  3) We lock dw_dgt.a for read, and we lock SYS_CALENDAR.CALDATES in
     view calendar for read.
  4) We do an all-AMPs RETRIEVE step from SYS_CALENDAR.CALDATES in view
     calendar by way of an all-rows scan with no residual conditions
     into Spool 7 (all_amps), which is duplicated on all AMPs.  The
     size of Spool 7 is estimated with high confidence to be 1,468,280
     rows (24,960,760 bytes).  The estimated time for this step is 0.16
     seconds.
  5) We do an all-AMPs JOIN step from Spool 7 (Last Use) by way of an
     all-rows scan, which is joined to dw_dgt.a by way of an all-rows
     scan with no residual conditions.  Spool 7 and dw_dgt.a are joined
     using a single partition hash join, with a join condition of (
     "dw_dgt.a.F_FECHA = cdate").  The input table dw_dgt.a will not be
     cached in memory.  The result goes into Spool 6 (all_amps), which
     is built locally on the AMPs.  The result spool file will not be
     cached in memory.  The size of Spool 6 is estimated with low
     confidence to be 112,607,372 rows (2,139,540,068 bytes).  The
     estimated time for this step is 1 minute and 23 seconds.
  6) We do an all-AMPs SUM step to aggregate from Spool 6 (Last Use) by
     way of an all-rows scan , grouping by field1 ( ((( CASE WHEN
     (((SYS_CALENDAR.CALDATES.cdate MOD 10000 )/ 100 )> 2) THEN
     ((((((146097 * (((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )/
     100 ))/ 4 )+ ((1461 * (((SYS_CALENDAR.CALDATES.cdate / 10000 )+
     1900 )- ((((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )/ 100 )*
     100 )))/ 4 ))+ (((153 * (((SYS_CALENDAR.CALDATES.cdate MOD 10000
     )/ 100 )- 3 ))+ 2 )/ 5 ))+ (SYS_CALENDAR.CALDATES.cdate MOD 100
     ))- 693901) ELSE ((((((146097 * ((((SYS_CALENDAR.CALDATES.cdate /
     10000 )+ 1900 )- 1 )/ 100 ))/ 4 )+ ((1461 *
     ((((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )- 1 )-
     (((((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )- 1 )/ 100 )*
     100 )))/ 4 ))+ (((153 * (((SYS_CALENDAR.CALDATES.cdate MOD 10000
     )/ 100 )+ 9 ))+ 2 )/ 5 ))+ (SYS_CALENDAR.CALDATES.cdate MOD 100
     ))- 693901) END )) MOD 7 )+ 1).  Aggregate Intermediate Results
     are computed globally, then placed in Spool 8.  The size of Spool
     8 is estimated with no confidence to be 1,048,576 rows (
     26,214,400 bytes).  The estimated time for this step is 1 minute
     and 33 seconds.
  7) We do an all-AMPs RETRIEVE step from Spool 8 (Last Use) by way of
     an all-rows scan into Spool 4 (group_amps), which is built locally
     on the AMPs.  Then we do a SORT to order Spool 4 by the sort key
     in spool field1 (((( CASE WHEN (((SYS_CALENDAR.CALDATES.cdate MOD
     10000 )/ 100 )> 2) THEN ((((((146097 *
     (((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )/ 100 ))/ 4 )+
     ((1461 * (((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )-
     ((((SYS_CALENDAR.CALDATES.cdate / 10000 )+ 1900 )/ 100 )* 100 )))/
     4 ))+ (((153 * (((SYS_CALENDAR.CALDATES.cdate MOD 10000 )/ 100 )-
     3 ))+ 2 )/ 5 ))+ (SYS_CALENDAR.CALDATES.cdate MOD 100 ))- 693901)
     ELSE ((((((146097 * ((((SYS_CALENDAR.CALDATES.cdate / 10000 )+
     1900 )- 1 )/ 100 ))/ 4 )+ ((1461 * ((((SYS_CALENDAR.CALDATES.cdate
     / 10000 )+ 1900 )- 1 )- (((((SYS_CALENDAR.CALDATES.cdate / 10000
     )+ 1900 )- 1 )/ 100 )* 100 )))/ 4 ))+ (((153 *
     (((SYS_CALENDAR.CALDATES.cdate MOD 10000 )/ 100 )+ 9 ))+ 2 )/ 5
     ))+ (SYS_CALENDAR.CALDATES.cdate MOD 100 ))- 693901) END )) MOD 7
     )+ 1).  The size of Spool 4 is estimated with no confidence to be
     1,048,576 rows (45,088,768 bytes).  The estimated time for this
     step is 0.52 seconds.
  8) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 4 are sent back to the user as the result of
     statement 1.  The total estimated time is 2 minutes and 57 seconds.


 BTEQ -- Enter your DBC/SQL request or BTEQ command:
explain
select ((f_fecha - date '1900-01-01') mod 7) + 1 day_of_week,
       count(1) total
  from my_db.prueba20
  group by day_of_week
  order by day_of_week
;


 *** Help information returned. 22 rows.
 *** Total elapsed time was 1 second.

Explanation
-------------------------------------------------------------------------
  1) First, we lock a distinct dw_dgt."pseudo table" for read on a
     RowHash to prevent global deadlock for my_db.prueba20.
  2) Next, we lock my_db.prueba20 for read.
  3) We do an all-AMPs SUM step to aggregate from my_db.prueba20 by
     way of an all-rows scan with no residual conditions
     , grouping by field1 ( (((my_db.prueba20.F_FECHA )- (DATE
     '1900-01-01')) MOD 7 )+ 1).  Aggregate Intermediate Results are
     computed globally, then placed in Spool 3.  The input table will
     not be cached in memory, but it is eligible for synchronized
     scanning.  The size of Spool 3 is estimated with low confidence to
     be 583 rows (14,575 bytes).  The estimated time for this step is 1
     minute and 5 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of
     an all-rows scan into Spool 1 (group_amps), which is built locally
     on the AMPs.  Then we do a SORT to order Spool 1 by the sort key
     in spool field1 (my_db.prueba20.Field_2).  The size of Spool 1
     is estimated with low confidence to be 583 rows (25,069 bytes).
     The estimated time for this step is 0.01 seconds.
  5) Finally, we send out an END TRANSACTION step to all AMPs involved
     in processing the request.
  -> The contents of Spool 1 are sent back to the user as the result of
     statement 1.  The total estimated time is 1 minute and 5 seconds.

In the formula we adopted 1 = 'Monday', but it can be converted to the 'Anglo' convention with a simple one-day shift.
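
For instance, anchoring on a Sunday instead of a Monday reproduces the SYS_CALENDAR numbering (1899-12-31 was a Sunday); a sketch along those lines, not part of the original post:

SELECT DATE,
       ((DATE - DATE '1900-01-01') MOD 7) + 1 day_of_week_iso,   /* Monday = 1 */
       ((DATE - DATE '1899-12-31') MOD 7) + 1 day_of_week_us     /* Sunday = 1 */
;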

Regards.

Carlos.

5 responses to DAY OF WEEK in Teradata

  1. PAul says:

    As always, very interesting 🙂

  2. OSCAR says:

    Hi
    Teradata provides all of the following day/week/month calculation functions:
    CURDATE() CURTIME() DAYOFMONTH()
    CURRENT_DATE() CURRENT_TIME() CURRENT_TIMESTAMP()
    DAYOFWEEK() DAYOFYEAR() HOUR()
    MINUTE() MONTH() MONTHNAME()
    NOW() QUARTER() SECOND()
    YEAR() WEEK() TIMESTAMPADD()
    TIMESTAMPDIFF() EXTRACT()

    As you can see, the one you mention is among them
    Regards

    • carlosal says:

      No.

      Teradata does not have those functions. You are most likely thinking of the functions provided by Teradata's ODBC 'driver', which is a very different thing. Those functions are EXTERNAL to Teradata and live in the intermediate ODBC layer. If you try to use them with bteq, or with a jdbc client, for example, you will see that they return an error (just as they do if you disable the ODBC functions in SQL Assistant).
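
      To illustrate (a hypothetical request, not from the original comment):

      SELECT DAYOFWEEK(CURRENT_DATE);

      works in SQL Assistant over ODBC with the ODBC functions enabled, because the driver rewrites it before it reaches the database, but the very same text submitted through bteq fails, since DAYOFWEEK is not a function the Teradata engine knows.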

      Regards.

      Carlos.

  3. OSCAR says:

    You are right, CLI does not support these ODBC functions, they do not even reach the PE, but I think the title is not entirely fair.
