Improve stored procedure logging performance for Amazon Redshift


Amazon Redshift is a massively parallel processing (MPP) large-scale analytics warehouse and is not suited for large volumes of single-row insert, update, or delete operations. Stored procedures can generate hundreds of single-row inserts to log tables that track their process flow and progress, and these single-row inserts or updates tend to slow down overall stored procedure execution. This article provides an alternative logging method that can improve stored procedure execution performance.

Amazon Redshift is designed for processing large volumes of data to derive business insights. It stores data in columnar format, in 1 MB immutable blocks. Because the data blocks are immutable, Redshift processes an UPDATE as a DELETE plus an INSERT: each update marks the older block(s) for deletion and creates new block(s) with the new values.

Now let's take into account the columnar format, where each column is stored in separate blocks. For a table with 10 columns, a single-row update results in 20 MB of block I/O (10 MB of deletes plus 10 MB of inserts), and a single-row insert results in 10 MB of block I/O.

Finally, all commits to the database go through a single queue, that is, they are serialized. Step-level logging from a stored procedure (SP) typically results in hundreds to thousands of single-row inserts or updates to an execution_status or similar table that is intended to track the progress of the SP and support audit checks, as sketched below. If you are running hundreds of such SPs, the operations on the execution_status table slow down overall SP execution time, leading to a poor experience.
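For illustration, a conventional logging pattern might look like the following sketch. The execution_status log table, its columns, and the staging/target table names are hypothetical; the point is that every logging statement is a separate single-row write.

CREATE OR REPLACE PROCEDURE sp_load_sales_fct()
AS $$
BEGIN
  -- Hypothetical execution_status log table: one single-row INSERT per step,
  -- followed later by a single-row UPDATE, each touching every column block.
  INSERT INTO execution_status
    (procedure_name, step_name, table_name, operation_name, status_text, start_ts)
  VALUES
    ('sp_load_sales_fct', 'load_fact', 'sales_fct', 'INSERT', 'STARTED', SYSDATE);

  -- Hypothetical load step.
  INSERT INTO sales_fct SELECT * FROM stg_sales;

  UPDATE execution_status
     SET status_text = 'COMPLETED', end_ts = SYSDATE
   WHERE procedure_name = 'sp_load_sales_fct' AND step_name = 'load_fact';
END;
$$ LANGUAGE plpgsql;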

Alternative logging method

Amazon Redshift provides the RAISE statement: you can use RAISE INFO for step-level logging from stored procedures, and RAISE EXCEPTION to fail a stored procedure when an audit check fails. RAISE INFO writes step-level messages to the Redshift system view SYS_PROCEDURE_MESSAGES and does not increase the runtime of your stored procedures by any significant margin.

You can define your custom logging text with a fixed pattern, for example: procedure name | step name | table name | operation name | status text | start timestamp | end timestamp.
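A minimal sketch of this pattern follows, assuming hypothetical staging and target tables stg_sales and sales_fct; only the RAISE statements and the pipe-delimited message format come from the approach described above.

CREATE OR REPLACE PROCEDURE sp_load_sales_fct()
AS $$
DECLARE
  v_start timestamp;
  v_rows  int;
BEGIN
  v_start := SYSDATE;

  -- Hypothetical load step.
  INSERT INTO sales_fct SELECT * FROM stg_sales;
  GET DIAGNOSTICS v_rows := ROW_COUNT;

  -- Audit check: fail the procedure when nothing was loaded.
  IF v_rows = 0 THEN
    RAISE EXCEPTION 'sp_load_sales_fct|load_fact|sales_fct|INSERT|FAILED: zero rows|%|%',
      v_start, SYSDATE;
  END IF;

  -- Step-level log message in the pipe-delimited pattern; it is written to
  -- SYS_PROCEDURE_MESSAGES instead of a user log table.
  RAISE INFO 'sp_load_sales_fct|load_fact|sales_fct|INSERT|SUCCESS rows=%|%|%',
    v_rows, v_start, SYSDATE;
END;
$$ LANGUAGE plpgsql;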

Note

The MESSAGE column of SYS_PROCEDURE_MESSAGES is limited to 1024 characters, so make sure your custom logging text pattern stays within this limit.
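If a message could exceed the limit, one option (a sketch, assuming a hypothetical v_log_msg variable already built inside the procedure) is to truncate it before raising:

-- Keep the message within the 1024-character limit of the MESSAGE column.
RAISE INFO '%', LEFT(v_log_msg, 1024);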

SQL to extract logs

You can use the following SQL to extract the elements of your custom logging text.

SELECT
  record_time,
  query_id,
  SPLIT_PART(message, '|', 1) AS procedure_name,
  SPLIT_PART(message, '|', 2) AS step_name,
  SPLIT_PART(message, '|', 3) AS table_name,
  SPLIT_PART(message, '|', 4) AS operation_name,
  SPLIT_PART(message, '|', 5) AS status_text,
  SPLIT_PART(message, '|', 6) AS start_timestamp,
  SPLIT_PART(message, '|', 7) AS end_timestamp
FROM SYS_PROCEDURE_MESSAGES
WHERE log_level = 'INFO'
  AND procedure_name = 'sp_load_sales_fct'
  AND record_time >= TRUNC(SYSDATE)
  AND message IS NOT NULL
ORDER BY record_time;

You can also choose to save the stored procedure execution log to your final logging table at a later time, every few hours or every day, using a single INSERT INTO ... SELECT operation, which is much faster than individual row-by-row inserts.
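As a sketch, assuming a hypothetical target table sp_execution_log whose columns mirror the SELECT list in the extraction query above, the periodic load could look like this:

-- Load only messages newer than what has already been persisted.
INSERT INTO sp_execution_log
SELECT
  record_time,
  query_id,
  SPLIT_PART(message, '|', 1) AS procedure_name,
  SPLIT_PART(message, '|', 2) AS step_name,
  SPLIT_PART(message, '|', 3) AS table_name,
  SPLIT_PART(message, '|', 4) AS operation_name,
  SPLIT_PART(message, '|', 5) AS status_text,
  SPLIT_PART(message, '|', 6) AS start_timestamp,
  SPLIT_PART(message, '|', 7) AS end_timestamp
FROM SYS_PROCEDURE_MESSAGES
WHERE log_level = 'INFO'
  AND message IS NOT NULL
  AND record_time > (SELECT COALESCE(MAX(record_time), '1970-01-01'::timestamp) FROM sp_execution_log);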
