IBM InfoSphere DataStage Interview Questions

Set L

Boost your career with IBM InfoSphere DataStage, a powerful ETL tool used for data integration, transformation, and data warehousing. Our platform offers a comprehensive collection of DataStage interview questions and exam preparation materials, covering everything from basic concepts to advanced topics. Whether you're a beginner or an experienced professional, explore real-world scenarios, practical questions, and expert-level insights to confidently prepare for interviews and certification exams.

DataStage Interview Questions

Question 01:

What is DB2 Connector in DataStage?
Answer:
The DB2 Connector is a native stage used to read from and write to IBM DB2 databases in parallel jobs. It is optimized for high performance and supports bulk operations, partitioning, and pushdown optimization.

Question 02:

What are the main features of DB2 Connector?
Answer:

High-speed data transfer
Bulk load support
Partitioned reads/writes
Pushdown optimization
Support for SQL queries

Question 03:

What is bulk load in DB2 Connector?
Answer:
Bulk load allows large volumes of data to be inserted quickly into DB2 tables using optimized database utilities instead of row-by-row inserts.

Question 04:

What is partitioned read in DB2 Connector?
Answer:
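A partitioned read splits the source data across the processing nodes so that each node reads its own subset in parallel, typically by applying a partitioning method to a key column.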
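As a rough illustration of the idea, the Python sketch below simulates a modulus-style partitioned read. It uses the standard-library sqlite3 module as a stand-in for DB2; the `orders` table, the `NODE_COUNT` value, and the `partition_query` helper are hypothetical names chosen for the example, not connector internals.

```python
import sqlite3

NODE_COUNT = 4  # hypothetical number of processing nodes


def partition_query(node_index, node_count):
    """Build the SELECT that one node would run under a modulus scheme."""
    sql = "SELECT id, amount FROM orders WHERE id % ? = ?"
    return sql, (node_count, node_index)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)", [(i, i * 1.5) for i in range(100)]
)

# Each node reads only its own slice; the nodes are simulated with a loop here.
slices = []
for node in range(NODE_COUNT):
    sql, params = partition_query(node, NODE_COUNT)
    slices.append(conn.execute(sql, params).fetchall())

total_rows = sum(len(s) for s in slices)
all_ids = sorted(row[0] for s in slices for row in s)
print(total_rows, [len(s) for s in slices])
```

Every row lands in exactly one slice, so the slices can be processed independently without duplicates.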

Question 05:

What is partitioned write?
Answer:
Data is written in parallel into DB2 tables for faster performance.

Question 06:

What is Array Size in DB2 Connector?
Answer:
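Array Size sets how many rows the connector sends to or fetches from the database in a single batch. Larger values reduce the number of round trips but use more memory.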
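One way to picture Array Size is as the batch length used when rows are sent to the database. The sketch below uses Python's standard-library sqlite3 module as a stand-in for DB2; the `customers` table, the `array_size` value, and the `write_in_batches` helper are illustrative assumptions rather than the connector's actual implementation.

```python
import sqlite3
from itertools import islice


def write_in_batches(conn, rows, array_size):
    """Insert rows in batches of `array_size`; return the number of batches sent."""
    it = iter(rows)
    batches = 0
    while True:
        batch = list(islice(it, array_size))
        if not batch:
            break
        # One executemany call stands in for one array sent to the database.
        conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", batch)
        batches += 1
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
rows = [(i, f"name_{i}") for i in range(1, 2501)]

batches = write_in_batches(conn, rows, array_size=1000)
row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(batches, row_count)
```

With 2,500 rows and an array size of 1,000, the rows go out in three batches instead of 2,500 single-row calls.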

Question 07:

What is isolation level in DB2?
Answer:
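The isolation level defines how much a transaction is protected from changes made by other concurrent transactions. DB2 names its levels Uncommitted Read (UR), Cursor Stability (CS), Read Stability (RS), and Repeatable Read (RR).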

Question 08:

What is the difference between DB2 Connector and ODBC?
Answer:

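DB2 Connector → native DB2 interface with DB2-specific features (bulk load, partitioned reads); usually faster
ODBC → generic driver-based access that works with many databases; usually slower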

Question 09:

What is write mode in DB2 Connector?
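Answer:
Write mode determines how incoming rows are applied to the target table. The modes include:

Insert
Update
Delete
Insert then update
Update then insert
Delete then insert
Bulk load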

Question 10:

When should DB2 Connector be used?
Answer:
When the source or target is a DB2 database, because the native connector generally gives the best performance.

🟣 Oracle Connector

Question 11:

What is Oracle Connector?
Answer:
A native stage used to connect and transfer data between DataStage and Oracle databases efficiently.

Question 12:

What are the features of the Oracle Connector?
Answer:

Bulk load (Direct Path Load)
Parallel processing
Partitioned read/write
Pushdown optimization

Question 13:

What is Direct Path Load?
Answer:
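Direct Path Load is a fast loading method that formats data blocks and writes them directly to the Oracle data files, bypassing the buffer cache and most of the conventional SQL INSERT processing.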

Question 14:

What is OCI in Oracle Connector?
Answer:
OCI (Oracle Call Interface) is Oracle's native C API. The Oracle Connector uses it to communicate with the Oracle database.

Question 15:

What is partitioning in Oracle Connector?
Answer:
Splitting the data across processing nodes so that each node reads or writes its own portion in parallel.

Question 16:

Difference between Oracle Connector and ODBC?
Answer:

Oracle Connector → optimized
ODBC → generic

Question 17:

What is SQL override?
Answer:
A custom SQL statement that you write yourself, used in place of the statement the stage would otherwise generate from the table name and column list.

Question 18:

What is commit frequency?
Answer:
The number of rows written before the connector commits the transaction.
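The effect of a commit interval can be sketched in a few lines of Python. This example uses the standard-library sqlite3 module in place of Oracle; the `events` table, the `record_count` parameter name, and the `load_with_commit_interval` helper are assumptions made for illustration.

```python
import sqlite3


def load_with_commit_interval(conn, rows, record_count):
    """Insert rows one by one, committing after every `record_count` rows."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO events (id, payload) VALUES (?, ?)", row)
        pending += 1
        if pending == record_count:
            conn.commit()
            commits += 1
            pending = 0
    if pending:  # commit the final, partially filled group
        conn.commit()
        commits += 1
    return commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
rows = [(i, f"event_{i}") for i in range(10)]

commits = load_with_commit_interval(conn, rows, record_count=4)
stored = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
print(commits, stored)
```

A small interval limits how much work is lost on failure but commits more often. A large interval reduces commit overhead at the cost of longer transactions.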

Question 19:

What is reject link in Oracle Connector?
Answer:
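A reject link is an output link that captures the records the database refused during a write (for example, because of a constraint violation), so they can be inspected or reprocessed.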
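The behavior can be approximated in Python as "write what succeeds, set aside what fails". The sketch below uses the standard-library sqlite3 module rather than Oracle; the `accounts` table and the `write_with_rejects` helper are hypothetical.

```python
import sqlite3


def write_with_rejects(conn, rows):
    """Insert rows; return the accepted count and a list of (row, error) rejects."""
    accepted = 0
    rejects = []
    for row in rows:
        try:
            conn.execute("INSERT INTO accounts (id, email) VALUES (?, ?)", row)
            accepted += 1
        except sqlite3.IntegrityError as exc:
            # The failed row is routed aside instead of aborting the load.
            rejects.append((row, str(exc)))
    conn.commit()
    return accepted, rejects


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
rows = [
    (1, "a@example.com"),
    (1, "dup@example.com"),  # duplicate primary key
    (2, None),               # violates NOT NULL
    (3, "c@example.com"),
]

accepted, rejects = write_with_rejects(conn, rows)
print(accepted, [row for row, _ in rejects])
```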

Question 20:

When to use Oracle Connector?
Answer:
When working specifically with Oracle DB for better performance.

🟡 ODBC Stage

Question 21:

What is ODBC Stage?
Answer:
A generic stage used to connect DataStage to any database for which an ODBC driver is available.

Question 22:

What is ODBC?
Answer:
Open Database Connectivity, a standard API for database access.

Question 23:

What are the advantages of the ODBC Stage?
Answer:

Supports multiple databases
Easy configuration

Question 24:

What are the limitations of the ODBC Stage?
Answer:

Slower performance
Limited optimization

Question 25:

When should ODBC be used?
Answer:
When native connectors are not available.

Question 26:

What is DSN in ODBC?
Answer:
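A DSN (Data Source Name) is a named configuration entry that stores the driver and connection details for a database, so the stage can refer to the connection by name.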

Question 27:

What is cursor type in ODBC?
Answer:
Defines how data is fetched (forward-only, scrollable).

Question 28:

What is transaction handling in ODBC?
Answer:
Transaction handling controls when changes are committed and when they are rolled back after an error.
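A minimal commit-or-rollback pattern looks like this in Python. The standard-library sqlite3 module is used here instead of an ODBC driver, and the `balances` table and `transfer` helper are invented for the example.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two rows as one transaction; return True on commit."""
    try:
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE name = ?", (amount, dst)
        )
        # This debit violates the CHECK constraint if src holds too little.
        conn.execute(
            "UPDATE balances SET amount = amount - ? WHERE name = ?", (amount, src)
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()  # undoes the credit that already ran
        return False


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE balances (name TEXT PRIMARY KEY, amount INTEGER CHECK (amount >= 0))"
)
conn.executemany("INSERT INTO balances VALUES (?, ?)", [("a", 100), ("b", 0)])
conn.commit()

ok_small = transfer(conn, "a", "b", 50)
ok_large = transfer(conn, "a", "b", 500)
final = dict(conn.execute("SELECT name, amount FROM balances"))
print(ok_small, ok_large, final)
```

The second transfer fails halfway, and the rollback leaves both balances as they were after the first transfer.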

Question 29:

Difference between ODBC and JDBC?
Answer:

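ODBC → language-independent, C-based API that uses driver libraries installed on the platform
JDBC → Java API that uses Java-based drivers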

Question 30:

Is ODBC parallel?
Answer:
The ODBC stage can run in parallel jobs, but its parallelism is limited compared with the native connectors.

🔴 Writing SQL in DataStage

Question 31:

How to write SQL in DataStage?
Answer:
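SQL statements are written in the database stages:

DB Connector stages (DB2, Oracle, and others)
ODBC stage

The Transformer stage uses its own expression language for derivations and constraints rather than SQL, so it is not normally a place to write queries.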

Question 32:

What is SQL Override?
Answer:
Custom SQL query replacing default table read/write.

Question 33:

Can we use joins in SQL override?
Answer:
Yes.

Question 34:

Can we use WHERE clause?
Answer:
Yes, for filtering data.

Question 35:

Can we use GROUP BY?
Answer:
Yes, for aggregation.

Question 36:

What is the benefit of SQL in DataStage?
Answer:
Filtering, joining, and aggregating in the database means fewer rows are moved into the DataStage job, which usually improves performance.

Question 37:

What is parameterized SQL?
Answer:
SQL that references job parameters (written as #ParameterName#), so that values such as schema names or dates are supplied at run time.
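DataStage resolves job parameters itself at run time. The Python sketch below only imitates the idea of textual substitution so the mechanism is easier to see. The `resolve_job_parameters` helper, the `SCHEMA` and `RUN_DATE` parameter names, and the `SALES` table are hypothetical.

```python
import re


def resolve_job_parameters(sql_template, params):
    """Replace #Name# tokens with parameter values; fail on unknown names."""

    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"No value supplied for job parameter {name!r}")
        return str(params[name])

    return re.sub(r"#([A-Za-z_][A-Za-z0-9_.]*)#", substitute, sql_template)


template = "SELECT * FROM #SCHEMA#.SALES WHERE LOAD_DATE = '#RUN_DATE#'"
sql = resolve_job_parameters(template, {"SCHEMA": "DWH", "RUN_DATE": "2024-01-31"})
print(sql)
# prints: SELECT * FROM DWH.SALES WHERE LOAD_DATE = '2024-01-31'
```

Because the value is inserted as plain text, any quoting the SQL needs (as around the date above) has to be part of the statement itself.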

Question 38:

What is pre-SQL and post-SQL?
Answer:

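Pre-SQL → a statement that runs before the stage starts processing rows (for example, clearing a staging table)
Post-SQL → a statement that runs after the stage has processed all rows (for example, updating an audit table)
In the connector stages these appear as the Before SQL and After SQL statement properties.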
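The ordering can be sketched as "before statement, load, after statement". The example below uses Python's standard-library sqlite3 module as a stand-in database; the `staging` and `load_audit` tables and the `run_stage` helper are assumptions made for illustration.

```python
import sqlite3


def run_stage(conn, before_sql, rows, after_sql):
    """Run a before statement, load the rows, then run an after statement."""
    conn.execute(before_sql)  # runs once, before any row is written
    conn.executemany("INSERT INTO staging (id, val) VALUES (?, ?)", rows)
    conn.execute(after_sql)   # runs once, after all rows are written
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, val TEXT)")
conn.execute("CREATE TABLE load_audit (table_name TEXT, row_count INTEGER)")
conn.execute("INSERT INTO staging VALUES (99, 'stale')")  # leftover from a prior run

run_stage(
    conn,
    before_sql="DELETE FROM staging",
    rows=[(1, "a"), (2, "b")],
    after_sql="INSERT INTO load_audit SELECT 'staging', COUNT(*) FROM staging",
)

audit = conn.execute("SELECT table_name, row_count FROM load_audit").fetchall()
print(audit)
```

The stale row is removed before the load, and the audit row records only the two freshly loaded rows.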

Question 39:

Can we call stored procedures?
Answer:
Yes, for example through the Stored Procedure stage or a user-defined SQL statement in a connector.

Question 40:

What is SQL pushdown?
Answer:
Executing transformation logic as SQL inside the database instead of in the DataStage engine.

⚡ Pushdown Optimization

Question 41:

What is Pushdown Optimization?
Answer:
A technique in which processing is pushed to the database instead of being performed by the DataStage engine.

Question 42:

Why use pushdown optimization?
Answer:

Improves performance
Reduces data transfer
Uses the database engine's processing power

Question 43:

What operations can be pushed down?
Answer:

Join
Filter
Aggregation
Sort

Question 44:

Difference between full and partial pushdown?
Answer:

Full → entire logic in DB
Partial → some logic in DB

Question 45:

What are requirements for pushdown?
Answer:

Supported connector
Compatible SQL
Proper configuration

Question 46:

Limitations of pushdown?
Answer:

DB dependency
Limited flexibility

Question 47:

How to enable pushdown?
Answer:
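In DataStage, pushdown is provided by the Balanced Optimization feature: the job is optimized in the Designer client, which produces a version of the job that pushes eligible processing into the source or target database through the connector stages.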

Question 48:

When not to use pushdown?
Answer:
When the transformation logic is too complex to express in SQL, or when the database is already heavily loaded.

Question 49:

Example of pushdown optimization?
Answer:
Performing aggregation in DB instead of Aggregator Stage.
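The difference is easy to see by comparing how many rows leave the database in each case. This Python sketch uses the standard-library sqlite3 module as a stand-in database; the `sales` table and its values are made up for the example.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EAST", 10), ("EAST", 20), ("WEST", 5), ("WEST", 15), ("WEST", 30)],
)

# Without pushdown: fetch every detail row, then aggregate outside the database
# (the role an Aggregator stage would play).
detail_rows = conn.execute("SELECT region, amount FROM sales").fetchall()
client_totals = defaultdict(int)
for region, amount in detail_rows:
    client_totals[region] += amount

# With pushdown: the database aggregates and returns one row per region.
pushed_rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(len(detail_rows), len(pushed_rows), pushed_rows)
```

Both approaches produce the same totals, but the pushed-down query transfers two rows instead of five. On real tables that gap can be several orders of magnitude.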

Question 50:

Best practices for Database Stages?
Answer:

Use native connectors
Use bulk load for large data
Optimize SQL queries
Use pushdown when possible
Avoid ODBC when a native connector is available
Monitor DB performance
