EXPDP IMPDP Data Pump Interview Questions | CloudDBA

Interview Preparation

EXPDP / IMPDP Data Pump Interview Questions

In-depth Q&A covering Oracle Data Pump architecture, performance tuning, cross-version exports, parallel processing, new 12c features, and real-world scenarios. Essential preparation for any Oracle DBA interview.

Data Pump EXPDP IMPDP Parallel Cross-version 12c Features Performance MCP / Workers

Fundamentals & Architecture

questions 1–21

Q1 What are the new Data Pump features introduced in Oracle 12c? ▶

Oracle 12c introduced several significant Data Pump enhancements:

VIEWS_AS_TABLES: Export a view’s result set as a table in the target database — useful for migrating complex view logic into a staging table.
DISABLE_ARCHIVE_LOGGING transform: Suppress redo generation during import for tables and/or indexes, dramatically speeding up large imports (requires NOARCHIVELOG mode or explicit bypass of FORCE LOGGING).
Transportable Tablespace full database export: Export an entire database using transportable tablespace mode for faster migration.
Pluggable Database (PDB) support: Export/import within and between PDBs and CDBs.
SecureFile LOB support improvements: Better handling of encrypted and deduplicated LOBs.
LOGTIME parameter: Prefix each log line with a timestamp for easier performance analysis.
Legacy mode parameter conversion: Oracle automatically maps old exp/imp parameters to their Data Pump equivalents.

Q2 When do you need to use the FLASHBACK_TIME or FLASHBACK_SCN parameter? ▶

Use FLASHBACK_SCN or FLASHBACK_TIME when you need a consistent point-in-time snapshot of the data across multiple tables or schemas. Without these parameters, each table is exported at a slightly different point in time — perfectly acceptable for backup, but problematic when the export must be transactionally consistent.

Common use cases:

Golden Gate initial load: The capture SCN must match the export SCN exactly. Use FLASHBACK_SCN equal to the Golden Gate capture point so GG can replay transactions from there without gaps or duplicates.
Cross-schema consistent exports: When tables have foreign key relationships across schemas, a consistent SCN ensures referential integrity in the exported data.
Point-in-time migration: To capture the database state at a known moment, for example just before a major release deployment.

expdp system/*** full=y flashback_scn=1234567 dumpfile=full_consistent.dmp \
  directory=dp_dir logfile=exp.log

Q3 During IMPDP with FULL=Y, do you need to pre-create users? ▶

No. When importing with FULL=Y, Data Pump automatically creates all users (schemas) found in the dump file, along with their default tablespace assignments, quotas, and granted roles. You only need to pre-create users when importing with SCHEMAS= and you want the user to exist with specific attributes before the import begins — or when using REMAP_SCHEMA to map objects to an existing user.

The target tablespaces referenced by the dump file must already exist, or you must use REMAP_TABLESPACE to redirect segments to existing tablespaces.

Q4 Which memory area do EXPDP and IMPDP primarily use? ▶

Data Pump primarily uses the Streams Pool (part of the SGA, configured by STREAMS_POOL_SIZE). The Streams Pool provides the buffer and queue infrastructure that Data Pump uses for its advanced queue-based communication between the master control process (MCP), worker processes, and the external agent.

If STREAMS_POOL_SIZE is not explicitly set, Oracle auto-tunes it from the shared pool. For large parallel Data Pump jobs, setting it explicitly (e.g. 256 MB to 1 GB) can prevent ORA-04031 errors and improve throughput.

Q5 How can you improve the performance of a Data Pump export or import? ▶

Key performance levers:

PARALLEL: Set PARALLEL=N to use multiple worker processes. Rule of thumb: 2× the number of CPU cores, but test — too many workers can cause I/O contention.
Multiple dump files: Match the number of dump files to the PARALLEL value: DUMPFILE=exp_%U.dmp (the %U generates sequentially numbered files).
STREAMS_POOL_SIZE: Size the Streams Pool explicitly to avoid auto-tuning overhead.
DISABLE_ARCHIVE_LOGGING (12c+ import): Suppress redo for tables and indexes. Dramatically reduces import time on large datasets.
Exclude indexes/constraints during import: Import data first, then create indexes separately with parallelism using a SQL script generated via SQLFILE.
COMPRESSION=ALL: Reduces dump file I/O, often improving performance when I/O is the bottleneck.
Network link import: Skip the dump file entirely — import directly from source database via a database link using NETWORK_LINK.
Use fast storage: Place dump files on high-throughput storage or ASM. Avoid NFS for very large exports.

Q7 What is the difference between traditional EXP/IMP and Data Pump EXPDP/IMPDP? ▶

Feature	Traditional EXP/IMP	Data Pump EXPDP/IMPDP
Dump file structure	Single proprietary binary file	Set of files (dump file set)
File access	Client or server side	Server side only (via Oracle Directory objects)
Metadata format	DDL statements embedded in dump	XML document format
Parallelism	Single stream only	Multiple parallel worker processes
Job control	None — kill the process to stop	Attach, detach, suspend, resume, kill via interactive mode
Tape support	Yes (sequential media)	No
Performance	Slower (client-side processing)	Significantly faster (server-side, parallel)
Network import	No	Yes, via NETWORK_LINK (no dump file needed)
Restart capability	No	Yes — jobs can be suspended and resumed

Q17 What processes are involved in a Data Pump job internally? ▶

A Data Pump job involves four distinct process types:

Client process: Initiated when you run expdp or impdp from the OS prompt. It invokes the Data Pump API and must stay running for the job (unless you detach and let it run in the background). If the client disconnects, the job continues but you cannot monitor it without reattaching.
Shadow process (foreground server process): Created when the client logs in to the database. It handles the initial API calls, creates the Master Table (a staging table in the job owner’s schema), and sets up the Advanced Queue channels used for internal communication. It persists for the duration of the client session.
Master Control Process (MCP): One MCP per Data Pump job. It orchestrates the entire job — dividing work into metadata and data tasks, distributing them to worker processes, tracking progress, and managing restarts. It reads and updates the Master Table.
Worker processes: Created by the MCP, one per unit of parallelism (controlled by the PARALLEL parameter). Workers perform the actual data and metadata extraction or loading. For parallel jobs, multiple workers can operate on different segments simultaneously.

The Master Table is key to job restartability — it records completed work so that if the job is interrupted, it can resume from the last checkpoint rather than starting over.

Q19 What is the VIEWS_AS_TABLES parameter? ▶

Introduced in Oracle 12c, VIEWS_AS_TABLES allows you to export the result set of a view as if it were a table. The target database receives the data as a new physical table, not the view definition.

expdp system/*** schemas=hr views_as_tables=hr.v_employee_summary \
  dumpfile=view_export.dmp directory=dp_dir

Use cases include materialising a complex view during a migration, creating a denormalised reporting copy, or transferring data from a source where you do not have access to the underlying tables but do have SELECT on the view.

Q20 In which cases must you use traditional EXP instead of EXPDP? ▶

Traditional exp is still required in these scenarios:

Exporting to tape (sequential media): Data Pump does not support sequential devices. Traditional exp can pipe output directly to tape.
Exporting from very old source databases: Data Pump was introduced in Oracle 10g. To export from Oracle 8i or 9i, you must use traditional exp.
Client-side file creation: If you need the dump file created on the client machine (not the server), traditional exp supports this. Data Pump always writes to server-side Oracle directories.
Certain legacy object types: In some edge cases, older object types or corrupt objects that traditional exp handles gracefully may cause Data Pump to error. Traditional exp has more lenient handling of some legacy metadata.

Q21 Does IMPDP create indexes and constraints automatically? What is the process? ▶

Yes, by default IMPDP creates indexes and constraints — but it does so after all data is loaded, not before. The default import order is:

Tablespaces and users (schema objects)
Tables (structure only)
Table data (rows)
Indexes
Constraints (primary key, unique, check, foreign key)
Triggers
Other dependent objects (grants, comments, statistics)

For large imports, you can exclude indexes and constraints during the initial load using EXCLUDE=INDEX,CONSTRAINT, load the data, then create them separately with parallelism. This is often significantly faster as the optimizer can build indexes on bulk-loaded data more efficiently than maintaining them row-by-row.

-- Step 1: Import data only, skip indexes and constraints
impdp system/*** directory=dp_dir dumpfile=schema.dmp \
  schemas=hr exclude=index,constraint

-- Step 2: Generate index/constraint DDL
impdp system/*** directory=dp_dir dumpfile=schema.dmp \
  schemas=hr include=index,constraint sqlfile=create_indexes.sql

-- Step 3: Run the SQL file with parallelism added
sqlplus system/*** @create_indexes.sql

Cross-version & Compatibility

questions 22–34

Q22 How do you determine the export mode (schema/table/full) from a dump file alone? ▶

Use the IMPDP SQLFILE parameter with FULL=Y — this generates the DDL that would be executed without actually importing anything. The output reveals what object types and schemas are present:

impdp system/*** dumpfile=unknown.dmp directory=dp_dir \
  sqlfile=check_contents.sql full=y

Alternatively, query the Master Table if the original export job is still accessible in the source database:

SELECT * FROM sys_export_schema_01;  -- or the job-specific master table name

The dump file header itself also encodes the export mode — you can read it using impdp ... NOLOGFILE=Y SQLFILE=metadata.sql and inspect the resulting file.

Q23 How do you import only specific schemas and exclude all others? ▶

Use the SCHEMAS parameter on import, even when the dump file was a full export:

-- Import only HR and SCOTT from a full dump
impdp system/*** dumpfile=full_export.dmp directory=dp_dir \
  schemas=hr,scott logfile=import_selective.log

To also remap schemas to different target users:

impdp system/*** dumpfile=full_export.dmp directory=dp_dir \
  schemas=hr remap_schema=hr:hr_new logfile=import_remap.log

When using SCHEMAS= against a full dump, Data Pump filters out all other schemas automatically. Only the metadata and data for the specified schemas is processed.

Q24 Is Data Pump platform-dependent? ▶

Data Pump dump files are platform-independent in terms of endianness — you can create a dump on Linux x86-64 and import it on Solaris SPARC, for example, without conversion. This is a key advantage over transportable tablespaces, which require platform endianness to match (or explicit conversion).

However, the dump file format is version-dependent: a dump created by a higher version of Oracle cannot be imported into a lower version without using the VERSION parameter during export (see Q30).

Q25 Describe a scenario where you need both Data Pump and RMAN. ▶

A classic scenario is a database migration with minimal downtime:

RMAN: Take a full backup of the source database and restore it to the target server. This handles the bulk of the data movement offline and is fast for large databases.
Data Pump (incremental): While RMAN restore completes on the target, use Data Pump to export only the objects that changed since the RMAN backup (using FLASHBACK_SCN or QUERY filtering) and import them on the target — synchronising the delta without another full copy.

Another scenario: after a point-in-time recovery (RMAN), use Data Pump to export specific tables that should be at an even earlier point in time from a backup control file — selectively restoring individual objects without a full database rollback.

Q26 What is the CONSISTENT option in traditional EXP, and what is its Data Pump equivalent? ▶

In traditional exp, CONSISTENT=Y exports all tables at the same point in time using a read-consistent snapshot via the SET TRANSACTION READ ONLY statement.

In Data Pump, the equivalent is FLASHBACK_SCN or FLASHBACK_TIME. There is no CONSISTENT parameter in Data Pump — you must explicitly specify an SCN or timestamp:

-- Consistent export at current SCN
expdp system/*** schemas=hr \
  flashback_time="to_timestamp(sysdate)" \
  dumpfile=hr_consistent.dmp directory=dp_dir

Q27 Why is Data Pump faster than traditional export/import? ▶

Several architectural advantages make Data Pump substantially faster:

Server-side execution: All processing happens inside the database server, eliminating the client-side network overhead of traditional exp/imp.
Parallel worker processes: Multiple workers extract or load data concurrently. Traditional exp/imp is strictly single-threaded.
Direct path: Data Pump uses direct path reads and writes, bypassing the buffer cache for large table operations — similar to SQL*Loader direct path.
Advanced Queues for metadata: XML-based metadata transfer through Oracle’s internal queue infrastructure is faster than the DDL text processing of traditional exp.
Efficient block-level data movement: Data is moved in database blocks, not row-by-row through SQL cursors.

Q28 What is the order in which objects are imported by IMPDP? ▶

Data Pump imports objects in dependency order to ensure that prerequisites are created before dependents. The general sequence is:

Tablespaces
Profiles
Users and roles
System grants and role grants
Object grants
Table definitions (structure)
Table data (rows)
Table partitions
Indexes (non-constraint)
Constraints (primary key, unique, foreign key, check)
Views, procedures, functions, packages
Triggers
Object type bodies
Comments and statistics

This ordering is why excluding indexes and constraints during import (then re-creating them afterward with parallelism) is such an effective performance optimisation — you avoid the overhead of maintaining them incrementally during the data load.

Q29 How do you import only metadata (structure), without data? ▶

impdp system/*** dumpfile=full.dmp directory=dp_dir \
  content=metadata_only schemas=hr logfile=metadata_import.log

The CONTENT=METADATA_ONLY parameter tells Data Pump to create all schema objects (tables, indexes, constraints, grants, etc.) but skip all row data. This is useful for:

Creating a structural copy of a schema for development or testing (then loading with synthetic data).
Comparing schema structures between environments.
Pre-creating objects before a subsequent data-only import.

The corresponding export parameter is also CONTENT=METADATA_ONLY.

Q30 What is the significance of the VERSION parameter for cross-release exports? ▶

The VERSION parameter in expdp controls the format of the dump file, allowing it to be imported by an older version of Oracle. By default, the dump uses the current database version’s format — which an older database cannot read.

Key rule: Data Pump checks only the major version number (9, 10, 11, 12, 19…), not the specific patch release (10.1, 10.2, 11.1, 11.2). So VERSION=10.2 and VERSION=10.1 produce equivalent compatibility.

-- Export from 12c in a format importable by 11g
expdp system/*** schemas=hr dumpfile=hr_for_11g.dmp \
  directory=dp_dir version=11.2

Important: Features or object types introduced after the target version will be silently excluded from the export. For example, exporting with VERSION=11.2 will omit any 12c-specific constructs (identity columns, invisible columns, etc.).

You cannot import a dump into a version older than the VERSION parameter used during export. For example, a dump created with VERSION=11.2 cannot be imported into 10g — the major version check will fail.

Q34 Using traditional EXP/IMP, how do you export from 11g and import into 10g or 9i? ▶

For traditional exp/imp, use the lower version’s client binary to perform the export from the higher version database. Oracle’s traditional export client is backward-compatible — the 10g exp client can connect to an 11g database and produce a dump file that the 10g imp can read.

# Use the 10g exp client to export from an 11g database
$ORACLE_10G_HOME/bin/exp userid=system/*** \
  file=export_10g_compat.dmp log=exp.log full=y

# Import using 10g imp
$ORACLE_10G_HOME/bin/imp userid=system/*** \
  file=export_10g_compat.dmp log=imp.log full=y

This method does not apply to Data Pump. For Data Pump cross-version exports, use the VERSION parameter (Q30).

Advanced Operations & Tuning

questions 35–61

Q35 Does Data Pump address table or index corruption? ▶

No. Data Pump is a logical export/import tool — it reads data through SQL and writes it through SQL. It will fail on corrupt blocks (ORA-01578) rather than skipping them. If a table has block corruption, Data Pump cannot export it cleanly.

To work around this:

Use RMAN RECOVER ... BLOCK CORRUPTION to repair physical corruption before exporting.
Use DBMS_REPAIR to mark and skip corrupt blocks (data loss accepted), then export.
Use the traditional exp with CONSISTENT=N — in some cases it handles minor corruption more gracefully than Data Pump.

Q36 What happens when legacy EXP parameters are used with EXPDP? ▶

Oracle automatically enters Legacy Mode and converts the old parameter to its Data Pump equivalent, logging the substitution. For example:

expdp system/*** rows=n dumpfile=ddl.dmp

Output shows the automatic conversion:

Legacy Mode Active due to the following parameters:
  Legacy Mode Parameter: "rows=FALSE" Location: Command Line
  Replaced with: "content=metadata_only"
Legacy Mode has set reuse_dumpfiles=true parameter.
Legacy Mode has set nologfile=true parameter.

Common legacy-to-Data Pump parameter mappings:

Legacy (EXP)	Data Pump (EXPDP)
`ROWS=N`	`CONTENT=METADATA_ONLY`
`CONSISTENT=Y`	`FLASHBACK_TIME / FLASHBACK_SCN`
`COMPRESS=Y`	`COMPRESSION=ALL`
`OWNER=`	`SCHEMAS=`
`FROMUSER= / TOUSER=`	`REMAP_SCHEMA=`
`IGNORE=Y`	`TABLE_EXISTS_ACTION=APPEND`

Q37 Which process is responsible for writing data into the dump file? ▶

The worker processes are responsible for reading data from the database segments and writing it to the dump file set. Each worker process handles one or more objects (tables, schemas) assigned to it by the Master Control Process (MCP). In a parallel export, multiple workers write to separate dump files simultaneously — which is why you must provide multiple file names (or use the %U substitution variable) when using PARALLEL > 1.

Q38 How do you optimise Streams Pool usage to tune Data Pump performance? ▶

Data Pump’s internal Advanced Queue messaging relies on the Streams Pool. If this pool is undersized, Oracle steals memory from the shared pool, causing contention and ORA-04031 errors.

-- Set explicitly for large parallel Data Pump jobs
ALTER SYSTEM SET STREAMS_POOL_SIZE = 512M SCOPE=BOTH;

Sizing guidance:

For small/medium jobs (PARALLEL ≤ 4): 128–256 MB is usually sufficient.
For large parallel jobs (PARALLEL 8–16+): 512 MB to 1 GB.
Monitor V$STREAMS_POOL_ADVICE to see current utilisation and projected needs.

Also ensure SGA_TARGET is large enough to accommodate the Streams Pool alongside other SGA components.

Q39 How do you operate a Data Pump job from a client machine? ▶

Run expdp or impdp from any Oracle client installation that has network connectivity to the target database. The Data Pump client utility connects to the database over SQL*Net — all processing happens on the server side:

expdp system/***@targetdb_tns schemas=hr \
  dumpfile=hr_export.dmp directory=dp_dir logfile=hr_exp.log

The Oracle Directory (dp_dir) must exist on the server, not the client. The dump file is written to the server filesystem path that the directory object points to.

If you need the dump file on the client machine, consider using a network share mounted on the server, or transfer the file after export using SCP/SFTP.

Q40 Can you run a Data Pump export from a physical standby without affecting the primary? ▶

Yes, from Oracle 11g onward you can run Data Pump exports directly from an Active Data Guard physical standby (which must be open in READ ONLY + REDO APPLY mode). The export reads from the standby and offloads the I/O burden from the primary entirely.

-- Connect and export from the standby
expdp system/***@standby_tns schemas=hr \
  dumpfile=hr_standby.dmp directory=dp_dir_standby logfile=exp.log

Requirements:

Active Data Guard license (READ ONLY + APPLY mode).
An Oracle Directory on the standby server filesystem.
The export runs as of the standby’s current applied SCN — use FLASHBACK_SCN if you need an exact SCN match with the primary.

Q41 How do you schedule a Data Pump job using DBMS_SCHEDULER? ▶

BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'NIGHTLY_HR_EXPORT',
    job_type        => 'EXECUTABLE',
    job_action      => '/u01/app/oracle/scripts/hr_export.sh',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=2; BYMINUTE=0',
    enabled         => TRUE,
    comments        => 'Nightly HR schema Data Pump export'
  );
END;
/

The shell script hr_export.sh contains the expdp command. Alternatively, use DBMS_DATAPUMP API directly inside a PL/SQL scheduler job for finer programmatic control — this avoids the need for a shell script:

-- Using DBMS_DATAPUMP API directly in a scheduler job
job_type => 'PLSQL_BLOCK',
job_action => '
  DECLARE
    h NUMBER;
  BEGIN
    h := DBMS_DATAPUMP.OPEN(''EXPORT'',''SCHEMA'',NULL,''HR_NIGHTLY'');
    DBMS_DATAPUMP.ADD_FILE(h, ''hr_nightly.dmp'', ''DP_DIR'');
    DBMS_DATAPUMP.METADATA_FILTER(h, ''SCHEMA_EXPR'', ''IN (''''HR'''')'');
    DBMS_DATAPUMP.START_JOB(h);
  END;'

Q42 Why is a Data Pump dump file significantly larger when encryption is enabled? ▶

Encrypted dump files are larger because:

Encryption disables compression of the encrypted content: When both ENCRYPTION and COMPRESSION are specified, compression must happen before encryption (compressed data is encrypted). However, if only ENCRYPTION is set without COMPRESSION, the raw uncompressed data is encrypted — and encrypted data does not compress well, so subsequent compression yields no benefit.
Block padding: AES encryption (the default) pads data to block boundaries, adding overhead.
Metadata overhead: Encryption headers, key metadata, and integrity checksums add to the file size.

Best practice: always pair ENCRYPTION=ALL with COMPRESSION=ALL to get compression before encryption:

expdp system/*** schemas=hr dumpfile=hr_enc.dmp directory=dp_dir \
  compression=all encryption=all encryption_password=MySecretPwd

Q43 In which circumstances should you use SQL*Loader instead of Data Pump? ▶

Use SQL*Loader (sqlldr) when:

Loading from flat files: CSV, fixed-width, delimited text files from external systems, ETL pipelines, or third-party applications. Data Pump requires an Oracle dump file.
Transforming data during load: SQL*Loader supports field-level transformations, conditional loading, and complex mapping logic in the control file.
Loading into a subset of columns: Map source file columns to specific database columns, skipping or defaulting others.
Loading from non-Oracle sources: Any time the source data is not already in an Oracle database.
Very high-speed direct path loads: SQL*Loader’s direct path bypass mode can be faster than Data Pump for initial data loads with no existing data in the target table.

Q44 What is the maximum parallelism for Data Pump? ▶

There is no hard Oracle-imposed maximum for the PARALLEL parameter, but practical limits apply:

CPU cores: Setting PARALLEL higher than 2× the number of available CPU cores yields diminishing returns and can cause CPU saturation.
Dump files: You must provide at least as many dump file names as the PARALLEL value — workers cannot share a single file. Use DUMPFILE=exp_%U.dmp for automatic sequencing.
I/O throughput: If storage is the bottleneck, adding workers just queues them at the I/O layer.
Table count: Only one worker per table — if you have 4 workers but only 2 large tables, the other 2 workers sit idle for those tables.

A rule of thumb: start with PARALLEL = number of CPUs, then increase to 2× CPUs if I/O utilisation is below 80% and CPU is not saturated.

Q45 What role or privilege must a user have to run Data Pump jobs? ▶

For full Data Pump access (any schema, any table), the user needs the EXP_FULL_DATABASE role for exports and IMP_FULL_DATABASE role for imports. These roles are granted to DBA by default.

For self-schema operations only (export/import your own objects), no special Data Pump role is needed — standard CREATE SESSION and object privileges are sufficient.

-- Grant full Data Pump privileges
GRANT EXP_FULL_DATABASE TO dp_user;
GRANT IMP_FULL_DATABASE TO dp_user;

-- Also grant read/write on the directory
GRANT READ, WRITE ON DIRECTORY dp_dir TO dp_user;

The DATAPUMP_EXP_FULL_DATABASE and DATAPUMP_IMP_FULL_DATABASE roles (synonyms for the above) were introduced for clarity in later releases. The directory read/write grant is always required separately.

Q46 Can a Data Pump job be resumed after a server reboot? ▶

Yes — this is one of Data Pump’s most powerful features. The Master Table persists in the database even across reboots. After the server restarts, reattach to the job using its job name:

-- List existing Data Pump jobs
SELECT owner_name, job_name, state, degree
FROM dba_datapump_jobs
WHERE state != 'NOT RUNNING';

-- Reattach to a stopped job
impdp system/*** attach=hr.sys_import_schema_01

-- Once attached, start the job again
Import> START_JOB

The job resumes from where it was interrupted — objects already processed are not re-imported. This makes Data Pump safe for very long-running migrations where server maintenance windows may intervene.

Q48 You received a dump file to import but don’t know the source schema. How do you find it? ▶

Use IMPDP with SQLFILE to extract the DDL without actually importing anything. The generated SQL file reveals all schema names and object types:

impdp system/*** dumpfile=unknown.dmp directory=dp_dir \
  full=y sqlfile=check_ddl.sql nologfile=y

Then inspect the SQL file:

grep -i "CREATE USER\|CREATE SCHEMA\|ALTER USER" check_ddl.sql | head -30

Alternatively, look at the dump file header using strings on Unix — the schema names are usually visible in the header metadata near the beginning of the file.

Q50 You see undo tablespace errors during a large import. How do you avoid them? ▶

Undo errors (ORA-30036: unable to extend segment in undo tablespace) during import occur because large imports generate substantial undo — especially for large tables with many rows or LOBs.

Solutions:

Increase undo tablespace size or enable autoextend before starting the import.
Increase UNDO_RETENTION to give undo more breathing room.
Import in smaller batches using the QUERY parameter to filter rows, or TABLE_EXISTS_ACTION=APPEND with partitioned imports.
Use DISABLE_ARCHIVE_LOGGING (12c): reduces redo and indirectly reduces some undo pressure.
Set UNDO_MANAGEMENT=AUTO with a large undo tablespace and AUTOEXTEND ON.

-- Dynamically extend undo tablespace during import if needed
ALTER TABLESPACE undotbs1 ADD DATAFILE '/u01/oradata/orcl/undo02.dbf'
  SIZE 4G AUTOEXTEND ON NEXT 512M MAXSIZE 20G;

Q51 How do you minimise redo generation during a large IMPDP operation? ▶

Use the TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y parameter introduced in Oracle 12c. It suppresses redo logging for table and/or index creation during import:

-- Disable archive logging for all objects (tables and indexes)
impdp scott/*** directory=dp_dir dumpfile=scott.dmp schemas=scott \
  transform=disable_archive_logging:y

-- Disable only for indexes
impdp scott/*** directory=dp_dir dumpfile=scott.dmp schemas=scott \
  transform=disable_archive_logging:y:index

-- Disable for tables; keep logging for indexes
impdp scott/*** directory=dp_dir dumpfile=scott.dmp schemas=scott \
  transform=disable_archive_logging:y:table

This transform does not completely eliminate redo — a small amount is still generated for internal Oracle operations. Also, the database parameter FORCE_LOGGING overrides this transform and cannot be bypassed. After import with this option, take a backup immediately as the imported objects are not recoverable to a point before the import.

Q52 How do you estimate the dump file size before running the export? ▶

-- Estimate using BLOCKS method (fast, based on allocated blocks)
expdp system/*** schemas=hr directory=dp_dir \
  estimate_only=y estimate=blocks logfile=estimate.log

-- Estimate using STATISTICS method (slower, uses optimizer stats)
expdp system/*** schemas=hr directory=dp_dir \
  estimate_only=y estimate=statistics logfile=estimate.log

ESTIMATE_ONLY=Y runs the export job up to the point of estimation then stops — no dump file is created. The log shows the estimated size per table and a total.

BLOCKS is faster and sufficient for most planning. STATISTICS is more accurate but slower as it reads optimizer statistics to derive row counts and average row sizes.

Add 10–20% buffer to the estimate for metadata, indexes, and compression variability when provisioning storage.

Q54 What compression ratio can you expect from the COMPRESSION parameter? ▶

Typical compression with COMPRESSION=ALL reduces dump file size by approximately 70–80% compared to an uncompressed export — meaning a 100 GB uncompressed dump might compress to 20–30 GB.

Actual ratio depends heavily on data type and content:

Text-heavy columns (VARCHAR2 with repeating values): excellent compression (85%+).
Already-compressed LOBs, BLOBs, or encrypted columns: minimal benefit.
Numeric data with high cardinality: moderate compression (50–60%).

expdp system/*** schemas=hr dumpfile=hr_compressed.dmp \
  directory=dp_dir compression=all logfile=exp.log

Q55 What techniques do you use when copying a multi-TB database with Data Pump? ▶

For very large databases, a combined strategy works best:

Use maximum PARALLEL: Set PARALLEL to 2× CPU count and provide matching dumpfile names with %U substitution. Spread dump files across multiple storage volumes.
Exclude indexes and constraints on export: Import data first, then create indexes in parallel using a separate SQL script. Indexes on bulk-loaded data are built far faster than maintained incrementally.
DISABLE_ARCHIVE_LOGGING (12c): Suppress redo during import to dramatically reduce I/O.
COMPRESSION=ALL: Reduces I/O for both the export write and subsequent import read.
Network Link import: Skip the dump file entirely — transfer directly from source to target over a database link, saving the write-then-read of dump files.
Consider Transportable Tablespaces for very large databases: copies datafiles directly at the OS level (much faster than row-level export) then imports only the metadata via Data Pump.

-- Large parallel export
expdp system/*** full=y parallel=8 compression=all \
  dumpfile=full_%U.dmp directory=dp_dir exclude=index,constraint \
  logfile=full_export.log

Q57 How do you verify that an export dump file is valid and uncorrupted? ▶

Use IMPDP with SQLFILE to perform a metadata-only parse of the dump file — if the file is corrupt, this will error before any data is imported:

impdp system/*** dumpfile=myexport.dmp directory=dp_dir \
  full=y sqlfile=validate_check.sql nologfile=y

Additional checks:

Verify file size matches the export log’s stated dump file size.
Compare MD5 checksums if the file was transferred: md5sum myexport.dmp on both source and destination.
Check the export log file for any ORA- errors or “Job completed with X errors” messages.
Run a test import of a single small table to confirm the file is readable end-to-end.

Q58 How can you import objects without creating a dump file? ▶

Use the NETWORK_LINK parameter. This connects directly to the source database over a database link and imports objects without creating or reading any dump file. All data flows over SQL*Net in real time:

-- Create a database link to the source
CREATE DATABASE LINK source_db
  CONNECT TO system IDENTIFIED BY ***
  USING 'source_tns';

-- Import directly from source, no dump file needed
impdp system/*** network_link=source_db schemas=hr \
  directory=dp_dir logfile=network_import.log \
  remap_schema=hr:hr_new

This is ideal for online migrations where storage for dump files is limited, or when you want to avoid the write-then-read overhead of dump file generation.

Q59 How do you change the parallelism of a Data Pump job while it is already running? ▶

Attach to the running job’s interactive interface and use the PARALLEL command:

-- Attach to a running job (find the job name from DBA_DATAPUMP_JOBS)
expdp system/*** attach=system.sys_export_schema_01

-- In the interactive prompt, change parallelism on the fly
Export> PARALLEL=8

-- Then detach and let it continue
Export> CONTINUE_CLIENT  -- or just CTRL+C to detach

Increasing parallelism adds more worker processes immediately. Decreasing it causes idle workers to finish their current task and then exit — the change is not instantaneous. Remember that adding workers also requires additional dump files to write to.

Q60 Which directory is used if you don’t specify the DIRECTORY parameter? ▶

Oracle uses the predefined directory object DATA_PUMP_DIR, which is created automatically when a database is created or upgraded. By default it points to:

$ORACLE_BASE/admin/$ORACLE_SID/dpdump/

You can verify its path:

SELECT directory_path FROM dba_directories
WHERE directory_name = 'DATA_PUMP_DIR';

While DATA_PUMP_DIR is convenient for quick tests, it is best practice to explicitly specify a directory in production scripts — this makes the file location unambiguous and avoids filling the Oracle admin directory with large dump files.

Q61 What are the four mechanisms Data Pump uses to move data in and out of a database? ▶

Data Pump can move data using four distinct mechanisms depending on the parameters used:

Direct Path API: The default and fastest method. Reads data blocks directly from datafiles, bypassing the SQL layer and buffer cache. Used for table data extraction in normal exports.
External Tables: Used as a bridge for certain export/import scenarios and for the VIEWS_AS_TABLES feature. Oracle creates an external table definition and reads data through it.
Database Links (Network Mode): When NETWORK_LINK is specified, Data Pump moves data directly between two databases over SQL*Net — no dump file is involved.
Transportable Tablespace: For TRANSPORT_TABLESPACES mode, Data Pump exports only the metadata (tablespace structure, object definitions) while the datafiles are physically copied at the OS level. This is the fastest method for moving large volumes of data.

EXPDP IMPDP Datapump interview questions

EXPDP / IMPDP Data Pump Interview Questions

Fundamentals & Architecture

Cross-version & Compatibility

Advanced Operations & Tuning

Leave a Comment Cancel Reply