Price/Performance of HANA, Exadata, Teradata, and Greenplum
HANA has the lowest latency because it is in-memory. When there are two nodes, a penalty is paid for crossing the network, which makes sense.
Exadata does well because the X3 product has SSD cache and I assumed an 80% hit ratio.
Teradata does a little worse because I assumed a lower hit ratio (they have less SSD per TB of data).
Greenplum does worse as they do all I/O against disks.
Teradata Complex CASE_N MLPPI
In addition to the RANGE_N partitioning mechanism, CASE_N is a good choice for distributing data with complex splitting criteria. It is much harder to express the following sample situation effectively with ORACLE's range-based partitioning syntax.
- FACT_WEB_CLICKS is a huge click-stream table
- 40% of clicks are generated by non-members or unsigned-in sessions
- 50% of the non-member clicks are generated by robots and crawlers
- 40% of the non-member clicks are associated with the home page (PAGE_ID = 12345)
- 60% of clicks are generated by members or signed-in sessions
- 40% of the member clicks go to the 15 most popular PAGE_IDs
- 50% of the traffic has TRACKING_ID populated, the other 50% does not
- 2% of the member clicks are generated by robots or crawlers
- For non-member clicks, negative random MEMBER_IDs are generated to avoid PI skew
- Most analytical queries only care about the clicks with MEMBER_ID > 0
- PAGE_ID is highly skewed, because the popular pages have a lot more clicks than the long tail
- Some analytical queries only care about the clicks with TRACKING_ID > 0
- Daily partitioning at the first level. This is commented out in the sample DDL so that we can map each partition's row count to a "PARTITION ID" later
DDL1: Catch-All Bucket without Explicit Criteria
CREATE MULTISET TABLE FACT_WEB_CLICKS ,NO FALLBACK ,
     NO BEFORE JOURNAL, NO AFTER JOURNAL,
     FREESPACE = 0 PERCENT, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO
     (
      WEB_CLICK_ID BIGINT NOT NULL,
      CLICK_DATE DATE FORMAT 'YYYY-MM-DD',
      CLICK_TIME TIMESTAMP(3),
      MEMBER_ID BIGINT NOT NULL,
      PAGE_ID INTEGER NOT NULL,
      SESSION_ID BIGINT NOT NULL,
      TRACKING_ID INTEGER NOT NULL,
      CLICK_SEQ INTEGER,
      LOCALE_ID INTEGER NOT NULL,
      TOTAL_TIME INTEGER,
      PAGE_VIEW_SEQ INTEGER,
      IS_CRAWLER CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT 'N'
     )
PRIMARY INDEX ( MEMBER_ID )
PARTITION BY (
  --RANGE_N(CLICK_DATE BETWEEN DATE '2012-07-01' AND DATE + 10 EACH INTERVAL '1' DAY ),
  CASE_N(
    MEMBER_ID <= 0 AND IS_CRAWLER = 'Y',
    MEMBER_ID <= 0 AND IS_CRAWLER = 'N' AND PAGE_ID = 12345,
    MEMBER_ID <= 0 AND IS_CRAWLER = 'N',            -- catch_all_bucket for other PAGE_ID(s)
    MEMBER_ID > 0 AND PAGE_ID IN (12345, 12355),
    MEMBER_ID > 0 AND PAGE_ID IN (20001, 20010, 20011),
    MEMBER_ID > 0 AND PAGE_ID IN (70001, 70008),
    MEMBER_ID > 0 AND PAGE_ID IN (80001, 80002),
    MEMBER_ID > 0 AND PAGE_ID IN (90001, 90002),
    MEMBER_ID > 0 AND PAGE_ID IN (60001, 60002),
    MEMBER_ID > 0 AND PAGE_ID IN (50001, 50002),
    MEMBER_ID > 0                                   -- catch_all_bucket for other PAGE_ID(s)
  ),
  CASE_N(TRACKING_ID <= 0, TRACKING_ID > 0)
);
Data Distribution Among Each Partition
select Partition, count(*) from FACT_WEB_CLICKS group by 1 order by 1;
The data are distributed across the partitions as expected. But the query optimizer cannot yet take full advantage of partition elimination.
Explain Plan
collect stats FACT_WEB_CLICKS column(partition);
collect stats FACT_WEB_CLICKS index(member_id);
collect stats FACT_WEB_CLICKS column(page_id);
collect stats FACT_WEB_CLICKS column(partition,member_id);

DIAGNOSTIC VERBOSEEXPLAIN ON FOR SESSION;
DIAGNOSTIC HELPSTATS ON FOR SESSION;

explain select * from FACT_WEB_CLICKS
where member_id > 0
  and tracking_id > 0
  and page_id in (70001)
  and click_date = date '2013-02-05';

Explanation
-----------------------------------------------------------------------
  1) First, we lock a distinct fact_web."pseudo table" for read on a
     RowHash to prevent global deadlock for fact_web.FACT_WEB_CLICKS.
  2) Next, we lock fact_web.FACT_WEB_CLICKS for read.
  3) We do an all-AMPs RETRIEVE step from 2 partitions of
     fact_web.FACT_WEB_CLICKS with a condition of (
     "(fact_web.FACT_WEB_CLICKS.CLICK_DATE = DATE '2013-02-05') AND
     ((fact_web.FACT_WEB_CLICKS.PAGE_ID = 70001) AND
     ((fact_web.FACT_WEB_CLICKS.MEMBER_ID > 0) AND
     (fact_web.FACT_WEB_CLICKS.TRACKING_ID > 0)))") into Spool 1
     (group_amps), which is built locally on the AMPs with Field1
     ("UniqueId"). The size of Spool 1 is estimated with no confidence
     to be 116,833,856 rows (14,253,730,432 bytes). Spool Asgnlist:
With this precise filter clause, the explain plan should hit only one partition (partition 12 in this case), but the actual plan also scans the catch-all partition 22. There should be a way to improve the partition elimination.
DDL2: Catch-All Bucket with Explicit Criteria
CREATE MULTISET TABLE FACT_WEB_CLICKS2 ,NO FALLBACK ,
     NO BEFORE JOURNAL, NO AFTER JOURNAL,
     FREESPACE = 0 PERCENT, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO
     (
      WEB_CLICK_ID BIGINT NOT NULL,
      CLICK_DATE DATE FORMAT 'YYYY-MM-DD',
      CLICK_TIME TIMESTAMP(3),
      MEMBER_ID BIGINT NOT NULL,
      PAGE_ID INTEGER NOT NULL,
      SESSION_ID BIGINT NOT NULL,
      TRACKING_ID INTEGER NOT NULL,
      CLICK_SEQ INTEGER,
      LOCALE_ID INTEGER NOT NULL,
      TOTAL_TIME INTEGER,
      PAGE_VIEW_SEQ INTEGER,
      IS_CRAWLER CHAR(1) CHARACTER SET LATIN NOT CASESPECIFIC DEFAULT 'N'
     )
PRIMARY INDEX ( MEMBER_ID )
PARTITION BY (
  --RANGE_N(CLICK_DATE BETWEEN DATE '2012-07-01' AND DATE + 10 EACH INTERVAL '1' DAY ),
  CASE_N(
    MEMBER_ID <= 0 AND IS_CRAWLER = 'Y',
    MEMBER_ID <= 0 AND IS_CRAWLER = 'N' AND PAGE_ID = 12345,
    MEMBER_ID <= 0 AND IS_CRAWLER = 'N' AND PAGE_ID <> 12345,  -- catch_all_bucket for other PAGE_ID(s)
    MEMBER_ID > 0 AND PAGE_ID IN (12345, 12355),
    MEMBER_ID > 0 AND PAGE_ID IN (20001, 20010, 20011),
    MEMBER_ID > 0 AND PAGE_ID IN (70001, 70008),
    MEMBER_ID > 0 AND PAGE_ID IN (80001, 80002),
    MEMBER_ID > 0 AND PAGE_ID IN (90001, 90002),
    MEMBER_ID > 0 AND PAGE_ID IN (60001, 60002),
    MEMBER_ID > 0 AND PAGE_ID IN (50001, 50002),
    MEMBER_ID > 0 AND PAGE_ID not in (12345,12355,20001,......,50002)  -- catch_all_bucket for other PAGE_ID(s)
  ),
  CASE_N(TRACKING_ID <= 0, TRACKING_ID > 0)
);

explain select * from FACT_WEB_CLICKS2
where member_id > 0
  and tracking_id > 0
  and page_id in (12345)
  and click_date = date '2013-02-05';

Explanation
--------------------------------------------------------------------------
  1) First, we lock a distinct fact_web."pseudo table" for read on a
     RowHash to prevent global deadlock for SCRATCH_DS.FACT_WEB_CLICKS2.
  2) Next, we lock fact_web.FACT_WEB_CLICKS2 for read.
  3) We do an all-AMPs RETRIEVE step from a single partition of
     fact_web.FACT_WEB_CLICKS2 with a condition of (
     "(fact_web.FACT_WEB_CLICKS2.PAGE_ID = 12345) AND ......
Now the query hits a single partition. So even though the first DDL achieves exactly the same data distribution across partitions as the second, the following explicit criteria for the catch-all buckets are what help the optimizer choose the optimal partition elimination:
- PAGE_ID <> 12345
- PAGE_ID not in (12345,12355,20001,……,50002)
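Teradata evaluates CASE_N conditions in order and assigns a row to the first one that matches, which is why the bare `MEMBER_ID > 0` catch-all in DDL1 stores the right rows yet still forces the optimizer to scan its partition: nothing in the DDL proves the catch-all excludes a given PAGE_ID. A rough sketch of that first-match assignment, with an abbreviated bucket list and illustrative names (not Teradata internals):

```python
# Sketch of CASE_N first-match semantics: a row lands in the first bucket
# whose predicate is true. Bucket predicates mirror an abbreviated form of
# the sample DDL; the bucket names are made up for illustration.

BUCKETS = [
    ("crawler",        lambda r: r["member_id"] <= 0 and r["is_crawler"] == "Y"),
    ("guest_home",     lambda r: r["member_id"] <= 0 and r["page_id"] == 12345),
    ("guest_other",    lambda r: r["member_id"] <= 0),
    ("member_popular", lambda r: r["member_id"] > 0 and r["page_id"] in (70001, 70008)),
    ("member_other",   lambda r: r["member_id"] > 0),  # bare catch-all, as in DDL1
]

def assign_bucket(row):
    for name, pred in BUCKETS:
        if pred(row):   # first matching condition wins
            return name
    return None

row = {"member_id": 42, "page_id": 70001, "is_crawler": "N"}
print(assign_bucket(row))  # member_popular

# Insert-time ordering puts the row in "member_popular", but at query time
# the optimizer must scan every bucket whose predicate *could* hold for
# page_id = 70001; with the bare catch-all it cannot rule out "member_other",
# so two partitions get scanned. Writing the catch-all as an explicit
# "page_id not in (...)" lets it prove the exclusion.
```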
Latin, UTF8 and UTF16 with Teradata
Recently I did some data movement between Teradata and HDFS, and felt it might be worth posting some insights about how to deal with Unicode in Teradata. Folks who only deal with ASCII strings can skip this whole post 🙂
[Error 3577] [SQLState HY000] Row size or Sort Key size overflow.
When it comes to the topic of Unicode in Teradata, the above error is an inevitable subject. Your DBA might tell you that your row is bigger than 64K bytes. But that's normally not the case. If the table contains quite a few VARCHAR(n)/CHAR(n) columns, here is how Teradata estimates the column size, regardless of whether CHARACTER SET Latin or Unicode is specified in the table DDL:
| Session CHARSET | Estimated Column Size in Bytes for VARCHAR(n)/CHAR(n) |
|---|---|
| Latin / ASCII | n x 1 |
| UTF8 | n x 3 |
| UTF16 | n x 2 |
Teradata then adds the other columns' byte sizes to determine the row size used in the BTEQ/JDBC/ODBC/TPT/FEXP parcels. Even if the row is actually stored on disk properly within the 64K limit, you still can't retrieve it if the estimated row size exceeds 64K. This is quite annoying: you can store the data, but you can't access it smoothly.
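A back-of-the-envelope sketch of that estimate, using the multipliers from the table above. The sample column widths and the use of a plain 64K cutoff are illustrative assumptions:

```python
# Estimate the parcel row size for a row of VARCHAR(n) columns under each
# session charset, using the multipliers from the table above.
# The sample widths are hypothetical; the real parcel overhead is ignored.

MULTIPLIER = {"ASCII": 1, "UTF8": 3, "UTF16": 2}
ROW_LIMIT = 64 * 1024  # 64K parcel limit

def estimated_row_bytes(varchar_widths, charset):
    return sum(n * MULTIPLIER[charset] for n in varchar_widths)

widths = [8000, 8000, 8000]  # three VARCHAR(8000) columns, for illustration
for cs in ("ASCII", "UTF8", "UTF16"):
    size = estimated_row_bytes(widths, cs)
    print(cs, size, "overflow" if size > ROW_LIMIT else "ok")
# In UTF8 mode the estimate is 72,000 bytes and overflows;
# in UTF16 it is 48,000 and fits, even though the stored data is identical.
```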
Your DBA might advise you to shrink the n in VARCHAR(n), or even to split the table vertically into two. But try the following workarounds before you restructure your table.
Set Session CharSet to ASCII
This will probably resolve the row size overflow error immediately. But all your French/Spanish/Chinese/Japanese/Arabic characters will show up as "?". This is not good!
Set Session CharSet to UTF16
Normally, people assume UTF16 requires more space both on disk and over the network, so UTF8 is always the best choice. But in Teradata's case:
- Unicode strings are stored as UTF16 on disk anyway. Yes, space is wasted; that's why version 13 added an algorithmic compression function just to compress UTF16 to UTF8. Only version 14 can store UTF8 on disk.
- The row size estimate is 33% smaller in UTF16 mode than in UTF8 mode for VARCHAR/CHAR/CLOB columns (n x 2 versus n x 3).
- UTF16 mode requires more bytes over the network than UTF8.
- UTF16 mode occupies more disk space if you export the query output to a file (binary or vartext).
- But when UTF8 mode bombs, UTF16 may let your queries go through and display the proper characters in your SQL client or BI client.
Therefore, UTF16 might be a better default choice for the following clients:
- ODBC for SQL Assistant, Cognos, MicroStrategy, etc.
- JDBC for AquaData Studio, SQuirreL SQL Client, etc.
- CLIv2 for BTEQ -c utf16 -e utf8
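The wire-size point in the list above is easy to verify with actual encoded lengths. A quick check (the sample string is arbitrary):

```python
# Actual encoded byte counts for a mostly-Latin string: UTF-16 costs more
# on the wire, even though Teradata's row-size *estimate* (n*2 vs n*3) is
# smaller in UTF16 mode.

s = "web click données"  # mostly ASCII with one accented word
utf8_bytes = len(s.encode("utf-8"))
utf16_bytes = len(s.encode("utf-16-le"))  # 2 bytes per BMP character, no BOM

print(utf8_bytes, utf16_bytes)  # 18 34
```

For Latin-heavy data the UTF-8 stream is close to 1 byte per character, while UTF-16 is a flat 2 bytes per BMP character, which is why UTF16 sessions move more data over the network and produce larger export files.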
For ETL tools, UTF16 provides better tolerance than UTF8 when non-BMP characters are encountered.
- When the ETL writer (DML) session uses Teradata ODBC UTF8, the session rejects records containing extended characters (outside the Basic Multilingual Plane). The error message looks like "Failure 6705 An illegally formed character string was encountered during translation".
- The same extended characters are converted to "�" under Teradata TPT UTF8, and the records can be written into the database.
- Informatica's implementation for Teradata ODBC has problems in UTF16 mode, so be very cautious.
BTEQ Example
Let's look at a few Unicode characters; some are BMP, some are non-BMP. Stored in UTF8 mode as a Linux text file, some of them (lines 3, 4, 7) take 3 bytes each, while the others (lines 1, 2, 5, 6) take 4 bytes each.
DDL:

CREATE SET TABLE unicode_test
( ustring VARCHAR(40) CHARACTER SET UNICODE NOT CASESPECIFIC )
NO PRIMARY INDEX;

$ cat insert8.sql
insert into unicode_test values('1 [ 🐱 ]');
insert into unicode_test values('2 [ 🕓 ]');
insert into unicode_test values('3 [ ℃ ]');
insert into unicode_test values('4 [ ↳ ]');
insert into unicode_test values('5 [ 🏠 ]');
insert into unicode_test values('6 [ 🐎 ]');
insert into unicode_test values('7 [ 赞 ]');

$ bteq -c utf8 -e utf8
# only line 3,4,7 go through; line 1,2,5,6 are rejected
# *** Failure 6705 An illegally formed character string was encountered during translation.

#### UTF16 mode will load the data, but display them as diamond-shape ? ####

$ bteq -c utf16 -e utf8
BTEQ -- Enter your SQL request or BTEQ command:
run file insert8.sql;
select * from unicode_test order by 1;
 *** Query completed. 7 rows found. One column returned.
 *** Total elapsed time was 1 second.

ustring      Char2HexInt(ustring)
-----------  ----------------------------------
1 [ �� ]     00310020005B0020FFFDFFFD0020005D   -- f0 9f 90 b1 (UTF8)
2 [ �� ]     00320020005B0020FFFDFFFD0020005D   -- f0 9f 95 93 (UTF8)
3 [ ℃ ]     00330020005B0020210300200020005D   -- e2 84 83 (UTF8)
4 [ ↳ ]      00340020005B002021B300200020005D   -- e2 86 b3 (UTF8)
5 [ �� ]     00350020005B0020FFFDFFFD0020005D   -- f0 9f 8f a0 (UTF8)
6 [ �� ]     00360020005B0020FFFDFFFD0020005D   -- f0 9f 90 8e (UTF8)
7 [ 赞 ]     00370020005B00208D5E00200020005D   -- e8 b5 9e (UTF8)

###################### without specifying any character set #################

$ bteq
BTEQ -- Enter your SQL request or BTEQ command:
 *** Logon successfully completed.
 *** Teradata Database Release is 13.10.05.03
 *** Teradata Database Version is 13.10.05.03a
 *** Transaction Semantics are BTET.
 *** Session Character Set Name is 'ASCII'.
run file insert8.sql;
select ustring, substr(char2hexint(ustring),1,40), octet_length(ustring)
from scratch_ds.unicode_test order by 1;
 *** Query completed. 7 rows found. 2 columns returned.
 *** Total elapsed time was 1 second.

ustring      Substr(Char2HexInt(ustring),1,40)         Octet_Length(ustring)
-----------  ----------------------------------------  ---------------------
1 [ 🐱 ]     00310020005B002000D0009F009000B10020005D                     10
2 [ 🕓 ]     00320020005B002000D0009F009500930020005D                     10
3 [ ℃ ]     00330020005B002000E200D700B800200020005D                     10
4 [ ↳ ]      00340020005B002000E2008600B300200020005D                     10
5 [ 🏠 ]     00350020005B002000D0009F008F01610020005D                     10
6 [ 🐎 ]     00360020005B002000D0009F0090008E0020005D                     10
7 [ 赞 ]     00370020005B002000E800B5009E00200020005D                     10

#### But if we query the same result from a JDBC or ODBC client, the display is totally messed up ####

ustring      Char2HexInt(ustring)
-----------  ------------------------------------------
1 [ б ]      00310020005B002000D0009F009000B10020005D
2 [ Ð ]      00320020005B002000D0009F009500930020005D
3 [ â׸ ]     00330020005B002000E200D700B800200020005D
4 [ â³ ]     00340020005B002000E2008600B300200020005D
5 [ К ]      00350020005B002000D0009F008F01610020005D
6 [ Ð ]      00360020005B002000D0009F0090008E0020005D
7 [ èµ ]     00370020005B002000E800B5009E00200020005D
As you can see from the sample above, Teradata's UTF16 mode still can't store the 4-byte characters beyond the BMP, but at least the ETL process goes through. ORACLE, by contrast, handles these sample characters flawlessly.
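The 3-byte vs 4-byte split in the sample maps directly to Unicode planes: code points up to U+FFFF (the BMP) need at most 3 bytes in UTF-8, while anything above needs 4. A quick check on two of the sample characters:

```python
# BMP characters need at most 3 bytes in UTF-8; non-BMP (astral) characters need 4.
for ch in ("\u2103", "\U0001F431"):  # ℃ (BMP, line 3) and 🐱 (non-BMP, line 1)
    cp = ord(ch)
    nbytes = len(ch.encode("utf-8"))
    print(f"U+{cp:04X} -> {nbytes} UTF-8 bytes, non-BMP: {cp > 0xFFFF}")
# U+2103 -> 3 UTF-8 bytes, non-BMP: False
# U+1F431 -> 4 UTF-8 bytes, non-BMP: True
```

This also explains the FFFD pairs in the hex dump: in UTF-16 a non-BMP character needs a surrogate pair (two 16-bit units), and Teradata replaced each half with the U+FFFD replacement character.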
Non-BMP Character in HDFS
Those 4-byte non-BMP characters also cause trouble on HDFS: they can be stored, but can't be decoded properly at retrieval time. So it is a good idea to filter them out at the input GUI.
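A minimal sketch of such an input-side filter; the choice to replace with U+FFFD rather than drop the character outright is an assumption:

```python
# Replace characters outside the Basic Multilingual Plane before load.
# Substituting U+FFFD is one option; dropping them entirely also works.

def filter_non_bmp(text, replacement="\ufffd"):
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)

cleaned = filter_non_bmp("5 [ \U0001F3E0 ]")  # the house glyph becomes U+FFFD
print(cleaned)
assert all(ord(c) <= 0xFFFF for c in cleaned)
```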
Column Size Calculation in TPT/BTEQ/FEXP /FASTLOAD
When we export/import data to/from files (binary or flat), the following two data types need special calculation to define the file schema:
- VARCHAR(n)/CHAR(n):
- ASCII mode: VARCHAR/CHAR(n)
- UTF8 mode: VARCHAR/CHAR(n*3)
- UTF16 mode: VARCHAR/CHAR(n*2)
- DECIMAL(m,n):
- MaxDecimalDigits = 15: any column with precision > 15 will have to be defined as DECIMAL(15,x) in the file schema
- MaxDecimalDigits = 18: any column with precision > 18 will have to be defined as DECIMAL(18,x) in the file schema
- MaxDecimalDigits = 38: this is not the default setting, so we have to specify it in the script
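Putting the two rules together, deriving the file schema types from the table's logical types can be sketched as follows. The tuple-based input format and the helper name are made up for illustration, not a Teradata API:

```python
# Derive export-file column types from logical column types, applying the
# session-charset multiplier for VARCHAR/CHAR and the MaxDecimalDigits cap
# for DECIMAL. Input columns are ("VARCHAR", n) or ("DECIMAL", m, n) tuples.

CHARSET_MULTIPLIER = {"ASCII": 1, "UTF8": 3, "UTF16": 2}

def file_schema_type(col, charset="UTF8", max_decimal_digits=18):
    kind = col[0]
    if kind == "VARCHAR":
        n = col[1]
        return f"VARCHAR({n * CHARSET_MULTIPLIER[charset]})"
    if kind == "DECIMAL":
        m, n = col[1], col[2]
        # precision beyond MaxDecimalDigits must be capped in the file schema
        return f"DECIMAL({min(m, max_decimal_digits)},{n})"
    return kind  # pass other types through unchanged

print(file_schema_type(("VARCHAR", 40), charset="UTF8"))            # VARCHAR(120)
print(file_schema_type(("DECIMAL", 38, 2)))                         # DECIMAL(18,2)
print(file_schema_type(("DECIMAL", 38, 2), max_decimal_digits=38))  # DECIMAL(38,2)
```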
Open Issue
If we use BTEQ or FASTLOAD to import a flat file (or binary data) in ASCII mode without specifying -c UTF8 or -c UTF16, the utility does not conduct any character validation, so the non-BMP and symbol characters are imported as is, but the strings cannot be queried or displayed in a JDBC or ODBC client.
With 14.00 and 14.10 available for upgrade, Teradata advises customers to set AcceptReplacementCharacters = FALSE, but NTA2805 states that 14.xx still does not accept as many Unicode characters as customers expect. So it seems AcceptReplacementCharacters = TRUE will still be required for a while.
Other Links
http://teradata.weizheng.net/2012/08/character-sets-in-teradata.html
http://hellohemanth.wordpress.com/2011/11/16/special-characters-issue-while-loading-from-oracle-and-sql-server-charcter-data/
Simple PERL script to remove the extra new line character from delimited flat files
When you use BTEQ, SQL*Plus, or Informatica to generate a delimited flat file from an RDBMS, the new line characters embedded within string/varchar columns are quite annoying. You can add a REPLACE() function around each string column in your SQL to remove these offending \n, but that's still painful.
Let's pipe the output of the SQL into the following PERL script, and it will try its best to reconstruct each record by removing the extra new line characters. The assumption is that either the 1st row is in good shape (it contains no extra new line character), or the 1st row is the column header.
#!/usr/bin/perl -w
# Reconstruct delimited records that were split by embedded newlines.
# Reads a file name from @ARGV, or STDIN when no file is given.

my $file_name = shift @ARGV;
my ($line_count, $line_fixed, $expected_delimiter_count, $count, $n, $del) = (0, 0, 0, 0, 0, '');
my ($delimiter, $next_line);    # declared here; the original left these implicit
my @possible_delimiter = (',', "\t", ';', '\|', '\cA', '\cZ');

# open STDIN or file
if (defined $file_name) {
    open INPUTPIPE, "<", $file_name or die $!;
} else {
    open INPUTPIPE, "<&STDIN";
}

# count delimiters in the 1st line
my $line_buffer = <INPUTPIPE>;
$n++;
print $line_buffer;
$line_count++;                  # the 1st line must be in good shape
foreach $del (@possible_delimiter) {
    chomp($line_buffer);
    $count = () = ($line_buffer =~ /$del/g);
    print STDERR $count . " ($del) delimiters found in 1st row.\n";
    if ($count > $expected_delimiter_count) {
        $expected_delimiter_count = $count;
        $delimiter = $del;
    }
}
die "Can't detect delimiter!\n" if $expected_delimiter_count <= 0;
print STDERR $expected_delimiter_count . " ($delimiter) delimiters expected.\n\n";

# read through the rest
while ($line_buffer = <INPUTPIPE>) {
    $n++;
    $count = () = ($line_buffer =~ /$delimiter/g);
    if ($count >= $expected_delimiter_count) {
        print $line_buffer;
        $line_count++;
        print STDERR "$n : $count : output \n";
        next;
    } else {
        # glue following lines back on until the expected delimiter count is reached
        while ($next_line = <INPUTPIPE>) {
            $n++;
            chomp($line_buffer);
            $line_buffer .= $next_line;
            $count = () = ($line_buffer =~ /$delimiter/g);
            # print STDERR "$n : $count : $line_buffer";
            if ($count >= $expected_delimiter_count) {
                print $line_buffer;
                $line_count++;
                $line_fixed++;
                last;
            }
        }
    }
}

# finish up
print STDERR $line_count . " rows output into STDOUT and " . $line_fixed . " rows fixed.\n";
close INPUTPIPE;
Informatica Repository Query (part 2)
Posted by ES in Informatica on 2010/01/09
Continuing with a few more queries for the Informatica PowerCenter Repository. Part 1 can be found at http://it.toolbox.com/blogs/golden-orbit-blog/dig-into-informatica-powercenter-repository-29027
How to find Teradata MLoad session against a particular table
select U.SUBJ_NAME,
       F.TASK_NAME     WORKFLOW_NAME,
       T.INSTANCE_NAME SESSION_NAME,
       S.INSTANCE_NAME TARGET_NAME,
       F.SUBJECT_ID,
       F.TASK_ID       WORKFLOW_ID,
       s.SESSION_ID,
       s.MAPPING_ID
from OPB_TASK_INST T,
     OPB_TASK F,
     OPB_SUBJECT U,
     ( select session_id, sess_widg_inst_id, version_number, instance_name, mapping_id,
              rank() over (partition by session_id, sess_widg_inst_id
                           order by version_number desc) sort_id
       from OPB_SWIDGET_INST
       where widget_type = 2
         and UPPER(INSTANCE_NAME) LIKE '%MY_TABLE_NAME%'
     ) S
where (s.session_id, s.sess_widg_inst_id, s.version_number) in
      ( select session_id, sess_widg_inst_id, max(VERSION_NUMBER) version_number
        from OPB_EXTN_ATTR e
        where object_type = 79
          and object_subtype = 315000     -- MLoad Extension
        group by SESSION_ID, sess_widg_inst_id
      )
  and s.SORT_ID = 1                       -- only the current version
  and s.SESSION_ID = T.task_id
  and s.VERSION_NUMBER = T.VERSION_NUMBER
  and T.WORKFLOW_ID = F.TASK_ID
  and T.VERSION_NUMBER = F.VERSION_NUMBER
  and F.subject_id = U.SUBJ_ID
order by 1,2,3;

-- It will be nice to see the workflow name in View Point
update OPB_EXTN_ATTR
set ATTR_VALUE = 'SubjectArea=$PMFolderName; Workflow=$PMWorkflowName; Session=$PMSessionName; WorkflowRunId=$PMWorkflowRunId;'
where attr_id = 26                        -- Query Band Expression
  and object_type = 79                    -- Target
  and OBJECT_SUBTYPE = 315000             -- Teradata Plug-in
  and (SESSION_ID, SESS_WIDG_INST_ID, VERSION_NUMBER) in
      ( select session_id, sess_widg_inst_id, max(VERSION_NUMBER) version_number
        from OPB_EXTN_ATTR e
        where object_type = 79
          and object_subtype = 315000
        group by SESSION_ID, sess_widg_inst_id
      );

update OPB_EXTN_ATTR
set ATTR_VALUE = '1'
where attr_id = 10                        -- Check "Drop Log/Error Tables" option
  and object_type = 79                    -- Target
  and OBJECT_SUBTYPE = 315000             -- Teradata Plug-in
  and ATTR_VALUE = '0';                   -- Option is not set yet
How to view all the expressions in a transformation
select wf.WIDGET_ID, wf.FIELD_ID, wf.FIELD_NAME,
       e.EXPR_ID, e.LINE_NO, e.EXPRESSION
from OPB_EXPRESSION e,
     OPB_WIDGET_FIELD wf,
     OPB_WIDGET_EXPR we
where wf.WIDGET_ID = e.WIDGET_ID
  and wf.WIDGET_ID = we.WIDGET_ID
  and wf.FIELD_ID = we.OUTPUT_FIELD_ID
  and we.EXPR_ID = e.EXPR_ID
  and wf.VERSION_NUMBER = we.VERSION_NUMBER
  and we.VERSION_NUMBER = e.VERSION_NUMBER
  and (wf.WIDGET_ID, wf.VERSION_NUMBER) =
      ( select w.widget_id, max(w.version_number) version_number
          --, w.widget_type, m.mapping_id
        from OPB_SUBJECT s, OPB_MAPPING m, OPB_WIDGET w, OPB_WIDGET_INST wi
        where m.subject_id = s.subj_id
          and m.mapping_id = wi.mapping_id
          and w.widget_id = wi.WIDGET_ID
          and w.subject_id = m.subject_id
          and s.SUBJ_NAME = 'folder_name1'
          and m.MAPPING_NAME = 'm_mapping_name1'
          and w.WIDGET_NAME = 'transformation_name1'
        group by w.widget_id
      );
Make sure “Insert Else Update” and “Update Else Insert” are checked for Dynamic Lookup
update OPB_WIDGET_ATTR
set ATTR_VALUE = '1'
where (WIDGET_ID, WIDGET_TYPE, VERSION_NUMBER) in
      ( select w.WIDGET_ID, w.WIDGET_TYPE, max(w.VERSION_NUMBER) VERSION_NUMBER
        from opb_widget w, opb_widget_attr wa
        where wa.ATTR_ID = 15
          and wa.ATTR_VALUE = '1'         -- Dynamic Lookup Cache
          and w.WIDGET_ID = wa.WIDGET_ID
          and w.WIDGET_TYPE = wa.WIDGET_TYPE
          and w.VERSION_NUMBER = wa.VERSION_NUMBER
          and w.WIDGET_TYPE = 11          -- Lookup
        group by w.WIDGET_ID, w.WIDGET_TYPE
      )
  and ATTR_ID in (18,19)                  -- Insert Else Update, Update Else Insert
  and LINE_NO = 1
  and ATTR_VALUE = '0';

select w.WIDGET_NAME, w.VERSION_ID, w.WIDGET_ID,
       wa.ATTR_ID, a.ATTR_NAME, wa.LINE_NO, wa.ATTR_VALUE
from opb_widget w, opb_widget_attr wa, opb_attr a
where w.WIDGET_NAME like ???              -- put your filter here
  and w.WIDGET_ID = wa.WIDGET_ID
  and w.WIDGET_TYPE = wa.WIDGET_TYPE
  and w.VERSION_NUMBER = wa.VERSION_NUMBER
  and wa.ATTR_ID = a.attr_id
  and wa.WIDGET_TYPE = a.OBJECT_TYPE_ID
order by w.WIDGET_NAME, w.VERSION_NUMBER desc, wa.ATTR_ID, wa.LINE_NO;