IBM InfoSphere DataStage Interview Questions
Set L
Boost your career with IBM InfoSphere DataStage, a powerful ETL tool used for data integration, transformation, and data warehousing. Our platform offers a comprehensive collection of DataStage interview questions and exam preparation materials, covering everything from basic concepts to advanced topics. Whether you're a beginner or an experienced professional, explore real-world scenarios, practical questions, and expert-level insights to confidently prepare for interviews and certification exams.
DataStage Interview Questions
Question 01:
What is DB2 Connector in DataStage?
Answer:
The DB2 Connector is a native stage used to read from and write to IBM DB2 databases in parallel jobs. It is optimized for high performance and supports bulk operations, partitioning, and pushdown optimization.
Question 02:
What are the main features of DB2 Connector?
Answer:
- High-speed data transfer
- Bulk load support
- Partitioned reads/writes
- Pushdown optimization
- Support for SQL queries
Question 03:
What is bulk load in DB2 Connector?
Answer:
Bulk load inserts large volumes of data into DB2 tables quickly by handing the rows to the database's load utility (DB2 LOAD) instead of issuing row-by-row INSERT statements.
Question 04:
What is partitioned read in DB2 Connector?
Answer:
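Each processing node reads its own subset of the table in parallel. The subset is determined by the partitioning method selected in the connector, for example the database partition that holds the rows, or a range or modulus calculation on a key column.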
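As a rough illustration of the idea, the sketch below splits a table read across nodes with a modulus on the key. It uses Python's built-in `sqlite3` as a stand-in database, not the DB2 Connector itself; the `customers` table and the `partitioned_read` function are hypothetical names chosen for the example, and the per-node queries run one after another here, whereas a parallel job would run them concurrently.

```python
import sqlite3

def partitioned_read(conn, node_count):
    """Return one list of rows per node, split by key modulo node_count."""
    partitions = []
    for node in range(node_count):
        # Each node issues the same query with a different modulus filter,
        # so every row is read by exactly one node.
        rows = conn.execute(
            "SELECT id, name FROM customers WHERE id % ? = ? ORDER BY id",
            (node_count, node),
        ).fetchall()
        partitions.append(rows)
    return partitions

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(i, f"customer_{i}") for i in range(1, 11)],
)

parts = partitioned_read(conn, 3)
for node, rows in enumerate(parts):
    print(f"node {node}: ids {[r[0] for r in rows]}")
```

With ten rows and three nodes, the ids are divided so that no row is read twice and none is missed.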
Question 05:
What is partitioned write?
Answer:
Each processing node writes its own partition of the data to the DB2 table at the same time, which shortens the overall load time.
Question 06:
What is Array Size in DB2 Connector?
Answer:
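Defines the number of rows the connector sends to or fetches from the database in a single batch (one round trip). Larger values reduce round trips at the cost of more memory per batch.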
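The effect of batching can be sketched in a few lines. This is only an analogy using Python's `sqlite3` module, not the connector's implementation; `write_in_arrays`, the `target` table, and the `array_size` argument are hypothetical names for the example.

```python
import sqlite3

def write_in_arrays(conn, rows, array_size):
    """Insert rows in batches of array_size; return the number of batches sent."""
    batches = 0
    for start in range(0, len(rows), array_size):
        batch = rows[start:start + array_size]
        # One executemany call stands in for one round trip to the database.
        conn.executemany("INSERT INTO target (id, amount) VALUES (?, ?)", batch)
        batches += 1
    conn.commit()
    return batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER, amount REAL)")
rows = [(i, i * 1.5) for i in range(10)]

batches = write_in_arrays(conn, rows, array_size=4)
loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(batches, loaded)
```

Ten rows with a batch size of 4 are sent as three batches (4, 4, and 2 rows).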
Question 07:
What is isolation level in DB2?
Answer:
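Defines how far a transaction is isolated from the concurrent changes of other transactions. DB2 names its levels Uncommitted Read (UR), Cursor Stability (CS, the default), Read Stability (RS), and Repeatable Read (RR).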
Question 08:
What is the difference between DB2 Connector and ODBC?
Answer:
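- DB2 Connector → native DB2 client interface; supports bulk load and partitioned reads/writes; usually faster
- ODBC → generic driver-based access that works with many databases; fewer database-specific optimizations; usually slower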
Question 09:
What is write mode in DB2 Connector?
Answer:
- Insert
- Update
- Delete
- Insert then update, or Update then insert (upsert behavior)
- Delete then insert
- Bulk load
Question 10:
When should DB2 Connector be used?
Answer:
When working with DB2 databases for optimal performance.
🟣 Oracle Connector
Question 11:
What is Oracle Connector?
Answer:
A native stage used to connect and transfer data between DataStage and Oracle databases efficiently.
Question 12:
What are the features of Oracle Connector?
Answer:
- Bulk load (Direct Path Load)
- Parallel processing
- Partitioned read/write
- Pushdown optimization
Question 13:
What is Direct Path Load?
Answer:
A fast load method that writes formatted data blocks directly into the Oracle data files, bypassing the buffer cache and most of the conventional SQL INSERT processing.
Question 14:
What is OCI in Oracle Connector?
Answer:
Oracle Call Interface (OCI) is Oracle's native C API. The connector uses it, through the Oracle client libraries, to communicate with the Oracle database.
Question 15:
What is partitioning in Oracle Connector?
Answer:
Distributing the rows of a table across processing nodes so that each node reads or writes its own share in parallel.
Question 16:
Difference between Oracle Connector and ODBC?
Answer:
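- Oracle Connector → native (OCI-based), optimized for Oracle, supports direct path load
- ODBC → generic, works with any database that has an ODBC driver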
Question 17:
What is SQL override?
Answer:
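A custom SQL statement that you write yourself instead of letting the stage generate the SQL from the table name and column list. In the connector stages this corresponds to the user-defined SQL option.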
Question 18:
What is commit frequency?
Answer:
The number of rows written before the connector commits the transaction. Smaller values release locks sooner; larger values reduce commit overhead.
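A minimal sketch of committing at a fixed interval, using Python's `sqlite3` in place of a real connector. The function name `load_with_commit_interval`, the `orders` table, and the `record_count` argument are assumptions made for the example.

```python
import sqlite3

def load_with_commit_interval(conn, rows, record_count):
    """Insert rows, committing after every record_count rows; return the commit count."""
    commits = 0
    pending = 0
    for row in rows:
        conn.execute("INSERT INTO orders (id, status) VALUES (?, ?)", row)
        pending += 1
        if pending == record_count:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        # Commit the final partial batch so no rows are left uncommitted.
        conn.commit()
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
rows = [(i, "NEW") for i in range(7)]

commits = load_with_commit_interval(conn, rows, record_count=3)
print(commits)
```

Seven rows with an interval of 3 lead to three commits: two full intervals and one final partial one.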
Question 19:
What is reject link in Oracle Connector?
Answer:
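An output link that captures the records the database rejected during a write (for example, because of a constraint violation), so the job can continue and the failures can be inspected or reprocessed.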
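The behavior can be imitated as follows. This is a sketch with Python's `sqlite3`, not the connector's reject mechanism; `write_with_rejects` and the `employees` table are hypothetical, and only constraint violations are treated as rejects here.

```python
import sqlite3

def write_with_rejects(conn, rows):
    """Insert rows one by one; return (loaded_count, rejected_rows_with_reason)."""
    loaded = 0
    rejected = []
    for row in rows:
        try:
            conn.execute("INSERT INTO employees (id, name) VALUES (?, ?)", row)
            loaded += 1
        except sqlite3.IntegrityError as exc:
            # The failed row goes to the "reject link" together with the error text.
            rejected.append((row, str(exc)))
    conn.commit()
    return loaded, rejected

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
rows = [(1, "Ana"), (2, "Ben"), (1, "Cy"), (3, None)]

loaded, rejected = write_with_rejects(conn, rows)
print(loaded, [row for row, _ in rejected])
```

The duplicate key and the missing name are set aside with their error messages, while the two valid rows are loaded.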
Question 20:
When to use Oracle Connector?
Answer:
When working specifically with Oracle DB for better performance.
🟡 ODBC Stage
Question 21:
What is ODBC Stage?
Answer:
A generic stage used to connect DataStage with any database supporting ODBC drivers.
Question 22:
What is ODBC?
Answer:
Open Database Connectivity, a standard API for database access.
Question 23:
What are the advantages of ODBC Stage?
Answer:
- Supports multiple databases
- Easy configuration
Question 24:
What are the limitations of ODBC Stage?
Answer:
- Slower performance
- Limited optimization
Question 25:
When should ODBC be used?
Answer:
When native connectors are not available.
Question 26:
What is DSN in ODBC?
Answer:
Data Source Name used to configure database connection.
Question 27:
What is cursor type in ODBC?
Answer:
Defines how data is fetched (forward-only, scrollable).
Question 28:
What is transaction handling in ODBC?
Answer:
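Controls when changes are committed and when they are rolled back, so that a failed write does not leave partial data in the target.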
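Commit and rollback can be illustrated with a small all-or-nothing load. The sketch uses Python's `sqlite3` rather than an ODBC driver; `transfer_rows` and the `accounts` table are hypothetical names.

```python
import sqlite3

def transfer_rows(conn, rows):
    """Insert all rows as one transaction; roll back everything if any row fails."""
    try:
        conn.executemany("INSERT INTO accounts (id, balance) VALUES (?, ?)", rows)
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        # Undo the rows of this batch that were inserted before the failure.
        conn.rollback()
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")

first = transfer_rows(conn, [(1, 100), (2, 200)])
second = transfer_rows(conn, [(3, 300), (1, 50)])  # id 1 already exists
remaining = [r[0] for r in conn.execute("SELECT id FROM accounts ORDER BY id")]
print(first, second, remaining)
```

The second batch fails on the duplicate key, so row 3 is rolled back along with it and only the first batch remains.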
Question 29:
Difference between ODBC and JDBC?
Answer:
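- ODBC → language-independent C API, reached through a driver manager and a database-specific driver
- JDBC → Java API, used from Java applications through a JDBC driver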
Question 30:
Is ODBC parallel?
Answer:
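To a limited extent. The stage can run on multiple nodes, but partitioned reads have to be configured explicitly, and it generally scales less well than a native connector.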
🔴 Writing SQL in DataStage
Question 31:
How to write SQL in DataStage?
Answer:
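SQL queries can be written in:
- DB Connector stages
- ODBC stage
- Stored Procedure stage (to call database procedures)
The Transformer stage does not run SQL; it uses its own derivation expressions.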
Question 32:
What is SQL Override?
Answer:
A custom SQL statement that replaces the SQL the stage would otherwise generate for a table read or write (see Question 17).
Question 33:
Can we use joins in SQL override?
Answer:
Yes.
Question 34:
Can we use WHERE clause?
Answer:
Yes, for filtering data.
Question 35:
Can we use GROUP BY?
Answer:
Yes, for aggregation.
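The three previous answers can be combined in one statement. The query below is a hypothetical example of user-written SQL with a join, a WHERE filter, and a GROUP BY, run against Python's `sqlite3` so that it is executable; the `orders` and `customers` tables and their columns are invented for the illustration.

```python
import sqlite3

# Hypothetical user-written query: join, filter, and aggregate in one statement.
USER_DEFINED_SQL = """
SELECT c.country, COUNT(*) AS order_count, SUM(o.amount) AS total_amount
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.amount >= ?
GROUP BY c.country
ORDER BY c.country
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "DE"), (2, "FR"), (3, "DE")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 100), (2, 1, 20), (3, 2, 50), (4, 3, 70)],
)

result = conn.execute(USER_DEFINED_SQL, (50,)).fetchall()
print(result)
```

Only the aggregated rows leave the database, one per country, instead of every order row.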
Question 36:
What is the benefit of SQL in DataStage?
Answer:
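Filtering, joining, or aggregating in the database means fewer rows are transferred to DataStage, which reduces data movement and usually improves performance.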
Question 37:
What is parameterized SQL?
Answer:
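Using job parameters in SQL statements so that values such as schema names or date ranges can change from run to run without editing the job. Parameters are referenced as #ParameterName# and are replaced with their values at run time.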
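The substitution step can be sketched as plain text replacement. This is an illustration of the idea only, assuming the `#NAME#` token style for job parameters; the function `resolve_job_parameters`, the parameter names, and the `SALES` table are hypothetical, and the actual resolution is done by DataStage at run time.

```python
import re

def resolve_job_parameters(sql, params):
    """Replace each #NAME# token with its value; raise if a parameter is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in params:
            raise KeyError(f"job parameter not supplied: {name}")
        return str(params[name])
    # Names may contain letters, digits, underscores, and dots (assumed syntax).
    return re.sub(r"#([A-Za-z_][A-Za-z0-9_.]*)#", substitute, sql)

template = "SELECT * FROM #SCHEMA#.SALES WHERE SALE_DATE >= '#START_DATE#'"
resolved = resolve_job_parameters(
    template, {"SCHEMA": "DW", "START_DATE": "2024-01-01"}
)
print(resolved)
# SELECT * FROM DW.SALES WHERE SALE_DATE >= '2024-01-01'
```

The same job can then be run against another schema or date range by changing only the parameter values.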
Question 38:
What is pre-SQL and post-SQL?
Answer:
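- Pre-SQL (Before SQL) → runs once before the stage starts processing rows, e.g. to truncate the target table
- Post-SQL (After SQL) → runs once after the stage has processed all rows, e.g. to rebuild an index or write an audit record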
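The order of execution can be sketched as follows, again with Python's `sqlite3` as a stand-in. The `run_stage` function and the `staging` and `load_audit` tables are hypothetical; in a real job the two statements are entered as properties of the database stage.

```python
import sqlite3

def run_stage(conn, before_sql, rows, after_sql):
    """Run the before statement, load the rows, then run the after statement."""
    conn.execute(before_sql)  # e.g. clear out old data
    conn.executemany("INSERT INTO staging (id, value) VALUES (?, ?)", rows)
    conn.execute(after_sql)   # e.g. record what was loaded
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging (id INTEGER, value TEXT)")
conn.execute("CREATE TABLE load_audit (row_count INTEGER)")
conn.execute("INSERT INTO staging VALUES (99, 'stale')")

run_stage(
    conn,
    before_sql="DELETE FROM staging",
    rows=[(1, "a"), (2, "b")],
    after_sql="INSERT INTO load_audit SELECT COUNT(*) FROM staging",
)
staged = conn.execute("SELECT id FROM staging ORDER BY id").fetchall()
audited = conn.execute("SELECT row_count FROM load_audit").fetchone()[0]
print(staged, audited)
```

The stale row is removed before the load, and the audit row reflects only the newly loaded data.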
Question 39:
Can we call stored procedures?
Answer:
Yes, through the Stored Procedure stage or by calling the procedure from the SQL of a database stage.
Question 40:
What is SQL pushdown?
Answer:
Executing transformation logic (joins, filters, aggregations) inside the database as SQL instead of in the DataStage engine.
⚡ Pushdown Optimization
Question 41:
What is Pushdown Optimization?
Answer:
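A technique in which processing is pushed to the source or target database and run there as SQL instead of in the DataStage parallel engine. In DataStage this capability is provided by Balanced Optimization.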
Question 42:
Why use pushdown optimization?
Answer:
- Improves performance
- Reduces data transfer
- Uses DB power
Question 43:
What operations can be pushed down?
Answer:
- Join
- Filter
- Aggregation
- Sort
Question 44:
Difference between full and partial pushdown?
Answer:
- Full → entire logic in DB
- Partial → some logic in DB
Question 45:
What are the requirements for pushdown?
Answer:
- Supported connector
- Compatible SQL
- Proper configuration
Question 46:
Limitations of pushdown?
Answer:
- DB dependency
- Limited flexibility
Question 47:
How to enable pushdown?
Answer:
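Pushdown is applied through Balanced Optimization in the Designer: open the job, run the Optimize function, and choose which processing to push to the source or target database. The result is saved as a new, optimized job. Writing the logic directly in a connector's user-defined SQL achieves a similar effect by hand.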
Question 48:
When not to use pushdown?
Answer:
When transformation logic is complex or DB load is high.
Question 49:
Example of pushdown optimization?
Answer:
Performing an aggregation in the database with GROUP BY instead of reading every row into DataStage and using an Aggregator stage.
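The difference in data movement can be made visible with a small comparison. This is a sketch with Python's `sqlite3`, where plain Python code plays the role of the ETL engine; the `sales` table and both function names are hypothetical.

```python
import sqlite3
from collections import defaultdict

def aggregate_in_engine(conn):
    """Fetch every row and sum per region outside the database."""
    rows = conn.execute("SELECT region, amount FROM sales").fetchall()
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals), len(rows)

def aggregate_in_database(conn):
    """Let the database do the GROUP BY and fetch only the totals."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region"
    ).fetchall()
    return dict(rows), len(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EAST", 10), ("EAST", 20), ("WEST", 5), ("WEST", 15), ("WEST", 30)],
)

engine_totals, engine_rows = aggregate_in_engine(conn)
db_totals, db_rows = aggregate_in_database(conn)
print(engine_totals == db_totals, engine_rows, db_rows)
```

Both approaches return the same totals, but the pushed-down version transfers two rows instead of five; on large tables that gap is where the performance gain comes from.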
Question 50:
Best practices for Database Stages?
Answer:
- Use native connectors
- Use bulk load for large data
- Optimize SQL queries
- Use pushdown when possible
- Avoid ODBC if better option exists
- Monitor DB performance
