Tag: Oracle EBS

  • ORA-00257: Archiver Error in Oracle 19c CDB — Diagnosis and Resolution

    Background

    This post documents a real production incident where an Oracle E-Business Suite (EBS) environment became completely unresponsive due to a full Fast Recovery Area (FRA). Users were reporting that everything was either slow or hanging — a classic sign that something fundamental had broken at the database layer.

    Symptoms

    • EBS application users reporting sessions hanging or extremely slow response
    • sqlplus apps/<password> failing with:
    ORA-00257: Archiver error. Connect AS SYSDBA only until resolved.
    • Non-SYSDBA connections completely blocked
    • 30 FNDLIBR (Concurrent Manager) processes running on the application server — higher than expected for a QA environment

    Environment

    • Oracle Database 19c (CDB/PDB architecture)
    • PDB: single application PDB hosting the EBS database
    • EBS 12.2 on AIX
    • FRA configured on ASM diskgroup (+RECO), size 4095 GB
    • No explicit log_archive_dest — archivelogs defaulting to FRA

    Step 1 — Identify the Root Cause

    Connected to the CDB as SYSDBA and confirmed the error:

    SHOW PARAMETER db_recovery_file_dest;
    NAME                        TYPE        VALUE
    --------------------------- ----------- ------
    db_recovery_file_dest       string      +RECO
    db_recovery_file_dest_size  big integer 4095G

    Checked FRA usage:

    COLUMN name           FORMAT A10
    COLUMN limit_gb       FORMAT 999,999.99 HEADING 'LIMIT GB'
    COLUMN used_gb        FORMAT 999,999.99 HEADING 'USED GB'
    COLUMN reclaimable_gb FORMAT 999,999.99 HEADING 'RECLAIMABLE GB'
    COLUMN number_of_files FORMAT 99999     HEADING 'FILES'
    
    SELECT name,
           ROUND(space_limit/1024/1024/1024,2)       limit_gb,
           ROUND(space_used/1024/1024/1024,2)        used_gb,
           ROUND(space_reclaimable/1024/1024/1024,2) reclaimable_gb,
           number_of_files
    FROM   v$recovery_file_dest;

    Output:

    NAME         LIMIT GB     USED GB  RECLAIMABLE GB  FILES
    ---------- ----------- ----------- -------------- ------
    +RECO         4,095.00    4,075.65            .00   2426

    FRA was at 99.5% — 4,075 GB used out of 4,095 GB, zero reclaimable, 2,426 archivelog files.

    This was the root cause. With the FRA full and nothing reclaimable, the archiver (ARCn) process could not write new archive logs, blocking all database activity.
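
    When the FRA is exhausted, foreground sessions typically stack up on redo-related wait events while ARCn is stuck. A quick confirmation query (a minimal sketch against the standard v$session view; the wait-event names are stock 19c):

    -- Sessions blocked because the archiver cannot free the online redo logs
    SELECT event, COUNT(*) session_count
    FROM   v$session
    WHERE  event IN ('log file switch (archiving needed)',
                     'log file switch (checkpoint incomplete)')
    GROUP  BY event;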

    Step 2 — Assess Backup Status Before Taking Action

    Before deleting any archivelogs, it is critical to understand the backup posture. Blindly deleting archivelogs without knowing the backup state can leave the database unrecoverable.

    Started an RMAN session and ran a crosscheck first:

    rman target /
    RMAN> crosscheck archivelog all;

    All 2,416 objects returned “validation succeeded” — no expired logs. This meant the FRA was genuinely full of valid, undeleted archivelogs.

    Next, checked backup history:

    RMAN> list backup summary;

    Backups were listed daily going back several weeks. On closer inspection, however, the job details told a different story:

    SELECT session_key,
           TO_CHAR(start_time,'DD-MON-YY HH24:MI') start_time,
           TO_CHAR(end_time,'DD-MON-YY HH24:MI')   end_time,
           status,
           input_bytes,
           output_bytes
    FROM   v$rman_backup_job_details
    ORDER  BY start_time DESC
    FETCH FIRST 10 ROWS ONLY;

    Output revealed a critical finding:

    SESSION_KEY  START_TIME         END_TIME           STATUS    INPUT_BYTES  OUTPUT_BYTES
    -----------  -----------------  -----------------  --------  -----------  ------------
          19986  30-APR-26 01:00    30-APR-26 01:00    COMPLETED           0         98304
          19904  29-APR-26 01:00    29-APR-26 01:00    COMPLETED           0         98304
    • INPUT_BYTES = 0 — no datafiles were being backed up
    • OUTPUT_BYTES = 98304 (96 KB) — only a controlfile/SPFILE autobackup
    • Jobs completing in under a minute — impossible for a real full DB backup

    Additionally:

    RMAN> list backup of archivelog all;
    -- specification does not match any backup in the repository

    No archivelog backups existed at all. The scheduled backup job was not performing proper datafile or archivelog backups — a separate issue to be addressed post-incident.
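
    The datafile side can be cross-checked the same way (standard RMAN syntax; output will vary per repository):

    RMAN> list backup of database summary;
    -- confirms whether any datafile backup sets exist at all
    RMAN> report need backup;
    -- lists datafiles needing a backup under the current retention policy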

    Step 3 — Escalation and Approval

    Given that there were no archivelog backups and the datafile backups were questionable, the decision to delete archivelogs was escalated to Tier 3. Approval was obtained with the instruction to start conservatively: delete logs older than 15 days first, then older than 7 days if needed.

    Step 4 — Resolution

    First attempt — delete older than 15 days:

    RMAN> delete noprompt archivelog until time 'SYSDATE-15';

    RMAN returned warnings:

    RMAN-08138: warning: archived log not deleted - must create more backups

    This is because the RMAN retention policy was blocking deletion of logs that had never been backed up. With Tier 3 approval, used the force option to override:

    RMAN> delete noprompt force archivelog until time 'SYSDATE-15';

    This ran successfully, deleting archivelogs from January through late April.

    FRA usage after deletion:

    USED_GB  RECLAIMABLE_GB  NUMBER_OF_FILES
    -------  --------------  ---------------
     246.71               0              149

    Over 3,800 GB freed — FRA dropped from 99.5% to ~6%.
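
    To double-check that only recent logs remained on disk, a quick sketch against the standard v$archived_log view (column names per stock 19c):

    -- Oldest archivelog still physically present after the cleanup
    SELECT MIN(first_time) oldest_log_time,
           COUNT(*)        logs_on_disk
    FROM   v$archived_log
    WHERE  deleted = 'NO'
    AND    name IS NOT NULL;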

    Step 5 — Verification

    Forced a log switch and verified archiver status:

    ALTER SYSTEM ARCHIVE LOG CURRENT;
    
    SELECT dest_id, status, error
    FROM   v$archive_dest
    WHERE  status != 'INACTIVE';

    Output:

    DEST_ID  STATUS    ERROR
    -------  --------  -----
          1  VALID

    Archiver resumed successfully. Application connectivity was restored and users confirmed sessions were working normally.
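
    As a final sanity check on the archiver processes themselves, a hedged sketch against v$archive_processes (standard in 19c):

    -- ARCn processes: STATUS should be ACTIVE; STATE toggles IDLE/BUSY
    SELECT process, status, log_sequence, state
    FROM   v$archive_processes
    WHERE  status = 'ACTIVE';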

    Redo Log Status Check

    As part of the verification, also confirmed redo log health:

    SELECT thread#,
           sequence#,
           TO_CHAR(first_time,'DD-MON-YY HH24:MI:SS') first_time,
           first_change#,
           archived,
           status
    FROM   v$log
    ORDER  BY thread#, sequence#;
    Output:

    THREAD#  SEQUENCE#  FIRST_TIME                FIRST_CHANGE#  ARC  STATUS
    -------  ---------  ------------------------  -------------  ---  -------
          1       2417  05-MAY-26 01:56:44         1.7867E+10    NO   INACTIVE
          1       2418  05-MAY-26 03:27:20         1.7867E+10    NO   INACTIVE
          1       2419  05-MAY-26 07:57:10         1.7868E+10    NO   CURRENT

    No stuck or unarchived redo logs — database in healthy state.

    Incident Summary

    Item             Detail
    ---------------  ------
    Error            ORA-00257: Archiver error
    Root Cause       FRA (+RECO ASM diskgroup) at 99.5% capacity
    FRA Before       4,075 GB used / 2,426 files
    FRA After        246 GB used / 149 files
    Action Taken     delete noprompt force archivelog until time 'SYSDATE-15'
    Space Freed      ~3,829 GB
    Resolution Time  ~45 minutes from identification to restoration

    Key Lessons Learned

    1. Monitor FRA Proactively

    Set up OEM 13c threshold alerts on FRA usage at 70% and 85%. Do not wait for ORA-00257 to discover the problem.

    SELECT ROUND(space_used/space_limit*100,2) pct_used
    FROM   v$recovery_file_dest;

    Alert at the 70% warning threshold and escalate at 85%, in line with the OEM thresholds above.
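
    Where OEM is unavailable, even a minimal cron job can cover the gap. A sketch (hypothetical script name and recipient; assumes OS authentication and a mailx binary on the DB host):

    #!/bin/sh
    # fra_alert.sh - alert when FRA usage crosses 80% (illustrative threshold)
    PCT=$(sqlplus -s / as sysdba <<'EOF' | tr -d '[:space:]'
    SET HEADING OFF FEEDBACK OFF PAGESIZE 0
    SELECT ROUND(space_used/space_limit*100) FROM v$recovery_file_dest;
    EXIT;
    EOF
    )
    if [ "$PCT" -gt 80 ] 2>/dev/null; then
      echo "FRA usage is ${PCT}%" | mailx -s "FRA ALERT: ${PCT}% used" dba-team@example.com
    fi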

    2. Always Check Backup Status Before Deleting Archivelogs

    In this incident, the backup job appeared to be running daily but was only backing up the controlfile (INPUT_BYTES=0). This is a serious gap — verify actual backup content, not just job status.
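
    A simple recurring check that flags "COMPLETED" jobs which read no datafile blocks (a sketch against the same v$rman_backup_job_details view used above):

    -- Backup jobs that completed but backed up zero input bytes
    SELECT session_key,
           TO_CHAR(start_time,'DD-MON-YY HH24:MI') start_time,
           status,
           input_bytes
    FROM   v$rman_backup_job_details
    WHERE  status = 'COMPLETED'
    AND    input_bytes = 0
    ORDER  BY start_time DESC;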

    3. Investigate the Backup Job

    A proper RMAN backup script for a 19c CDB should include:

    BACKUP DATABASE PLUS ARCHIVELOG DELETE INPUT;
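
    A fuller sketch of the nightly job (illustrative; compression and the obsolete cleanup are assumptions to adapt to site policy):

    RMAN> run {
      BACKUP AS COMPRESSED BACKUPSET DATABASE
        PLUS ARCHIVELOG DELETE INPUT;
      BACKUP CURRENT CONTROLFILE;
      DELETE NOPROMPT OBSOLETE;
    }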

    4. Consider an Archivelog Deletion Policy

    If archivelog backups are not being taken, the retention policy will block deletion, as the RMAN-08138 warnings in this incident showed. Once proper archivelog backups are in place, configure an RMAN archivelog deletion policy so that logs become reclaimable in the FRA as soon as they are backed up:

    CONFIGURE ARCHIVELOG DELETION POLICY TO BACKED UP 1 TIMES TO DEVICE TYPE DISK;

    5. FRA Sizing Review

    Even at 4095 GB, the FRA filled up due to months of uncleaned archivelogs. Review archivelog generation rate and size FRA accordingly, or implement regular cleanup.
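
    To size the FRA against reality, measure the daily generation rate first. A sketch using v$archived_log (blocks * block_size approximates each log's size; dest_id = 1 assumes a single archive destination):

    -- Daily archivelog generation in GB over the last 7 days
    SELECT TRUNC(first_time) day,
           ROUND(SUM(blocks * block_size)/1024/1024/1024, 2) gb_generated
    FROM   v$archived_log
    WHERE  first_time > SYSDATE - 7
    AND    dest_id = 1
    GROUP  BY TRUNC(first_time)
    ORDER  BY day;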

    Conclusion

    ORA-00257 is one of those errors that brings an entire EBS environment to its knees instantly. The fix itself is straightforward — free up FRA space — but the investigation matters. Rushing to delete archivelogs without understanding the backup posture can result in an unrecoverable database.

    In this case, careful investigation revealed a deeper issue with the backup job that would have gone unnoticed had we not looked. Always verify, always escalate, and always get approval before deleting recovery-critical files.


    Syed Anwar Ahmed is an Oracle Apps DBA with over 11 years of production experience across Oracle EBS, Database, RAC, GoldenGate, and OEM environments. He writes about real-world Oracle incidents at syedanwarahmedoracle.blog.

  • Oracle EBS 12.2 — ADOP fs_clone Failure: Failed to Delete FMW_Home (Root Cause & Fix)

    Category: Oracle EBS 12.2  |  Topic: ADOP Patching  |  Difficulty: Intermediate  |  Oracle Support: Search ADOP fs_clone Failed to delete FMW_Home on My Oracle Support

    Introduction

    Oracle EBS 12.2 introduced Online Patching (ADOP), which relies on a dual file system architecture — a Run File System (fs2) where production runs, and a Patch File System (fs1) where patches are applied. The fs_clone phase synchronises fs1 from fs2 at the start of each patching cycle, making fs1 a fresh copy of the production file system.

    One of the most common issues encountered during fs_clone is a failure while trying to delete the FMW_Home directory on the Patch FS. This blog walks through a real production scenario on a 2-node RAC database with 4 application server nodes — covering the exact error, step-by-step diagnostic process, root cause identification, fix applied, and the final successful run with actual timings. All server-specific details have been anonymised.

    This issue applies to Oracle EBS 12.2.x on all platforms. For related Oracle Support articles, search “ADOP fs_clone Failed to delete FMW_Home” on My Oracle Support.

    Environment

    Parameter                Value
    -----------------------  -----
    Application              Oracle E-Business Suite 12.2 (2-Node RAC DB + 4 Application Server Nodes)
    ADOP Version             C.Delta.13
    ADOP Session ID          129
    Run File System (fs2)    /u01/app/fs2 (Production, active)
    Patch File System (fs1)  /u01/app/fs1 (Inactive, patching target)
    Shared Storage           NFS-mounted shared volume (1.3T, 447G free)
    OS User                  applmgr

    Incident Timeline

    Event                  Timestamp                    Duration    Outcome
    ---------------------  ---------------------------  ----------  -------
    1st run started        Apr 11, 2026 17:07:35        -           time adop phase=fs_clone executed
    1st run failed         Apr 11, 2026 ~18:41          ~1h 34m     FATAL ERROR: FMW_Home deletion failed
    Diagnosis performed    Apr 11, 2026 18:45-19:44     ~59m        Root cause identified: root-owned OHS log files
    Fix applied (mv)       Apr 11, 2026 ~19:44          Seconds     FMW_Home renamed to dated backup as applmgr
    2nd run started        Apr 11, 2026 19:45           -           time adop phase=fs_clone re-executed after fix
    2nd run completed      Apr 12, 2026 00:49:08        5h 21m 25s  SUCCESS: all 4 app nodes completed ✅
    Total session elapsed  Apr 11 17:07 → Apr 12 00:49  7h 46m 33s  Full session including failed run + fix + retry

    The Issue

    During time adop phase=fs_clone, the synchronisation process was progressing normally — staging the file system clone, detaching Oracle Homes, removing APPL_TOP and COMM_TOP — until it reached stage 6 (REMOVE-1012-ORACLE-HOME) inside the removeFMWHome() function, where it attempted to delete the FMW_Home directory on the Patch FS and hit a fatal error.

    Error on the Console

    fs_clone/remote_execution_result_level1.xml:
    *******FATAL ERROR*******
    PROGRAM : (.../fs2/EBSapps/appl/ad/12.0.0/patch/115/bin/txkADOPPreparePhaseSynchronize.pl)
    TIME    : Apr 11 18:41:40 2026
    FUNCTION: main::removeDirectory [ Level 1 ]
    ERRORMSG: Failed to delete the directory /u01/app/fs1/FMW_Home.
    [UNEXPECTED]fs_clone has failed

    Key Log File: txkADOPPreparePhaseSynchronize.log

    The primary log file is located at:

    $ADOP_LOG_DIR/<session_id>/<timestamp>/fs_clone/<node>/TXK_SYNC_create/
        txkADOPPreparePhaseSynchronize.log

    Inside this log, the clone status progression was clearly visible:

    ========================== Inside getCloneStatus()... ==========================
    clone_status             = REMOVE-1012-ORACLE-HOME
    clone_status_from_caller = 7
    clone_status_from_db     = 6
    Removing the directory: /u01/app/fs1/FMW_Home
    Failed to delete the directory /u01/app/fs1/FMW_Home.
    *******FATAL ERROR*******
    FUNCTION: main::removeDirectory [ Level 1 ]
    ERRORMSG: Failed to delete the directory /u01/app/fs1/FMW_Home.

    clone_status_from_db = 6 indicates the process had already completed: fs_clone staging, detach of Oracle Homes, removal of APPL_TOP, COMM_TOP, and 10.1.2 Oracle Home. It failed specifically and only while removing FMW_Home.
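
    For a failed session, the standard status report shows which node and phase stopped (output layout varies across ADOP deltas):

    # Summary of the current adop session
    adop -status

    # Per-phase detail for the session
    adop -status -detail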

    ADOP fs_clone Stage Flow

    Stage  DB Clone Status          Description
    -----  -----------------------  -----------
    1      STARTED                  Session initialised
    2      FSCLONESTAGE-DONE        File system staging completed
    3      DEREGISTER-ORACLE-HOMES  Oracle Homes deregistered from inventory
    4      REMOVE-APPL-TOP          APPL_TOP removed from Patch FS
    5      REMOVE-COMM-TOP          COMM_TOP removed from Patch FS
    6      REMOVE-1012-ORACLE-HOME  Removing FMW_Home ❌ FAILED HERE
    7+     (clone proceeds…)        Clone fs2 to fs1, re-register homes, config clone

    Diagnostic Steps

    Step 1 — Confirm You Are on the Correct File System

    Most critical check first. The target must be on Patch FS (fs1), never Run FS (fs2):

    $ echo $FILE_EDITION   # must show: run
    run
    $ echo $RUN_BASE       # must point to fs2
    /u01/app/fs2

    ⚠️ If FILE_EDITION shows patch, stop immediately — source the Run FS environment before proceeding.
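
    Sourcing the Run FS environment uses the standard EBSapps.env wrapper; the path below follows this environment's /u01/app base and should be adapted:

    # Source the Run edition environment (path per this environment's layout)
    . /u01/app/EBSapps.env run
    echo $FILE_EDITION   # should now print: run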

    Step 2 — Check for Open File Handles

    lsof +D /u01/app/fs1/FMW_Home 2>/dev/null
    fuser -cu /u01/app/fs1/FMW_Home 2>&1

    In our case both commands returned empty output — no active process was holding FMW_Home open.

    Step 3 — Identify Root-Owned Files

    find /u01/app/fs1/FMW_Home ! -user applmgr -ls 2>/dev/null

    Output revealed multiple root-owned files under the OHS instance directories:

    drwxr-x---  3 root root 4096 Feb 15 06:48 .../EBS_web_OHS4/auditlogs/OHS
    -rw-------  1 root root    0 Feb 15 06:48 .../EBS_web_OHS4/diagnostics/logs/OHS/EBS_web/sec_audit_log
    -rw-r-----  1 root root 5670 Feb 15 06:49 .../EBS_web_OHS4/diagnostics/logs/OHS/EBS_web/EBS_web.log
    -rw-r-----  1 root root  249 Feb 15 06:49 .../EBS_web_OHS4/diagnostics/logs/OHS/EBS_web/access_log
    ... (same pattern for EBS_web_OHS2 and EBS_web_OHS3)

    All root-owned files were dated February 15 — nearly 2 months stale. This confirmed they were leftovers from OHS being incorrectly started as root during a previous patching cycle. No active process was involved.

    Step 4 — Attempt Manual Delete to Confirm the Error

    rm -rf /u01/app/fs1/FMW_Home 2>&1 | head -5
    rm: cannot remove '.../EBS_web_OHS4/diagnostics/logs/OHS/EBS_web/sec_audit_log': Permission denied

    This confirmed the issue was purely a file ownership/permission problem — not filesystem corruption or an NFS issue.

    Step 5 — Check Disk Space

    df -h /u01/app/fs1
    Filesystem      Size  Used Avail Use% Mounted on
    nfs_server:/vol  1.3T  844G  447G  66% /u01

    447GB free — sufficient to retain a backup of FMW_Home by renaming it.

    Root Cause Analysis

    The root cause was OHS (Oracle HTTP Server) being started as root on the Patch File System during a previous patching cycle in February 2026. This created log and audit files owned by root under:

    /u01/app/fs1/FMW_Home/webtier/instances/EBS_web_OHS2/auditlogs/OHS/
    /u01/app/fs1/FMW_Home/webtier/instances/EBS_web_OHS2/diagnostics/logs/OHS/EBS_web/
    /u01/app/fs1/FMW_Home/webtier/instances/EBS_web_OHS3/  (same structure)
    /u01/app/fs1/FMW_Home/webtier/instances/EBS_web_OHS4/  (same structure)

    Since fs_clone runs as applmgr, and applmgr cannot delete files owned by root, the removeDirectory() function in txkADOPPreparePhaseSynchronize.pl failed with Permission Denied — surfaced as a fatal error.

    Why did OHS create root-owned files? If OHS start/stop scripts are executed as root or with sudo (instead of using applmgr-owned wrapper scripts), the resulting log and audit files are created with root ownership and persist on the Patch FS across patching cycles.
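
    As a rule of thumb, service control scripts must run as the applications owner. A sketch (adapcctl.sh is the standard OHS control script in 12.2; names and environment sourcing are per your runbook):

    # Wrong: creates root-owned log/audit files that later block fs_clone
    sudo sh $ADMIN_SCRIPTS_HOME/adapcctl.sh start

    # Right: start/stop OHS as the applications OS user
    # (assumes applmgr's profile sources the Run FS environment)
    su - applmgr -c 'sh $ADMIN_SCRIPTS_HOME/adapcctl.sh start'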

    Pre-Action Safety Checklist

    Check                                           Expected      Result
    ----------------------------------------------  ------------  -------
    FILE_EDITION = run                              run           ✅ PASS
    RUN_BASE points to fs2                          /u01/app/fs2  ✅ PASS
    FMW_Home target is on fs1 (Patch FS only)       fs1 only      ✅ PASS
    lsof returns empty (no open handles)            Empty         ✅ PASS
    Root-owned files are stale (no active process)  Stale only    ✅ PASS
    Sufficient disk space for backup rename         > 50GB free   ✅ PASS
    Production services confirmed running on fs2    fs2 up        ✅ PASS

    Solution — Move FMW_Home as Backup

    The safest approach on production is to move (rename) FMW_Home rather than deleting it. This avoids the need for root access entirely, completes in seconds, and preserves a backup.

    Why mv works even with root-owned files: on the same filesystem, mv is an atomic rename of the directory entry. It never reads or modifies the files inside the directory, so applmgr can rename FMW_Home even when files within it are owned by root; the rename only needs write permission on the parent directory, which applmgr has. This is fundamentally different from rm -rf, which must access and remove each individual file.

    Step 1 — Move FMW_Home as a Dated Backup

    mv /u01/app/fs1/FMW_Home /u01/app/fs1/FMW_Home_$(date +%d%b%Y)_bkp && echo "MOVE SUCCESSFUL"
    MOVE SUCCESSFUL

    Step 2 — Verify FMW_Home Is Gone

    ls -lrt /u01/app/fs1/

    Step 3 — Confirm You Are applmgr Before Retrying

    whoami
    # Expected output: applmgr

    ⚠️ Never run adop as root. Always confirm whoami shows applmgr before executing any adop command.

    Step 4 — Retry fs_clone

    time adop phase=fs_clone

    Running fs_clone Safely on Production

    time adop phase=fs_clone on a 2-node RAC with 4 application server nodes takes several hours. Never run it in a plain SSH/PuTTY session that could disconnect. Use one of the following:

    • VNC Session (Best): Network drops have zero impact on the running process.
    • nohup: nohup adop phase=fs_clone > /tmp/fsclone_$(date +%Y%m%d_%H%M%S).log 2>&1 &
    • screen: screen -S fsclone then time adop phase=fs_clone. Detach with Ctrl+A D, reattach with screen -r fsclone.

    Successful Run — 2nd Attempt

    After applying the fix, time adop phase=fs_clone was re-executed. The adopmon output confirmed all 4 application nodes progressing through validation, port blocking, clone steps, and config clone phases without any errors.

    ADOP (C.Delta.13)
    Session Id: 129
    Command:    status
    Node Name   Node Type  Phase        Status     Started               Finished              Elapsed
    ----------  ---------  -----------  ---------  --------------------  --------------------  -------
    app-node1   master     FS_CLONE     COMPLETED  2026/04/11 17:07:35   2026/04/12 00:49:08   7:46:33
    app-node2   slave      CONFIG_CLONE COMPLETED  2026/04/11 17:07:36   2026/04/12 01:01:55   7:47:19
    app-node3   slave      CONFIG_CLONE COMPLETED  2026/04/11 17:07:36   2026/04/12 01:01:25   7:47:49
    app-node4   slave      CONFIG_CLONE COMPLETED  2026/04/11 17:07:36   2026/04/12 01:02:16   7:47:40
    File System Synchronization Type: Full
    adop exiting with status = 0 (Success)
    Summary report for current adop session:
        Node app-node1:  - Fs_clone status: Completed successfully
        Node app-node2:  - Fs_clone status: Completed successfully
        Node app-node3:  - Fs_clone status: Completed successfully
        Node app-node4:  - Fs_clone status: Completed successfully
    adop exiting with status = 0 (Success)
    real    321m25.733s   (5 hours 21 minutes 25 seconds)
    user     40m1.142s
    sys      70m59.804s

    The 2nd run completed cleanly in 5 hours 21 minutes 25 seconds across all 4 application nodes. File System Synchronization Type: Full.

    Post-Resolution Cleanup

    After a successful fs_clone and full patching cycle, old FMW_Home backups can be removed. Keep the most recent backup until the next patching cycle completes, then clean up older ones as root (since they may contain root-owned files):

    ls -lrt /u01/app/fs1/FMW_Home*
    du -sh /u01/app/fs1/FMW_Home*
    # Remove old backups as root
    sudo rm -rf /u01/app/fs1/FMW_Home_<old_date>_bkp

    Prevention — Avoiding Recurrence

    • Never start OHS as root. Always use applmgr-owned wrapper scripts. Never use sudo or root to run adohs.sh or adadminsrvctl.sh.
    • Post-patching ownership check. After every adop finalize/cutover, run: find /u01/app/fs1 ! -user applmgr -ls 2>/dev/null | head -20
    • Pre-fs_clone health check. Verify no lingering adop sessions, confirm Run FS services are healthy, check disk space, and verify no root-owned files under fs1/FMW_Home before starting; a minimal sketch follows below.
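
    A minimal pre-flight script along those lines (hypothetical script name; paths and the applmgr user follow this environment and should be adapted):

    #!/bin/sh
    # pre_fsclone_check.sh - pre-flight checks before running fs_clone

    echo "== File edition (must be: run) =="
    echo "${FILE_EDITION}"

    echo "== adop session status (no lingering sessions expected) =="
    adop -status

    echo "== Root-owned files under Patch FS FMW_Home (expect none) =="
    find /u01/app/fs1/FMW_Home ! -user applmgr -ls 2>/dev/null | head -20

    echo "== Open handles on Patch FS FMW_Home (expect none) =="
    lsof +D /u01/app/fs1/FMW_Home 2>/dev/null

    echo "== Disk space on Patch FS =="
    df -h /u01/app/fs1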

    Summary

    Item                   Detail
    ---------------------  ------
    Phase                  adop phase=fs_clone
    Failing Function       main::removeDirectory inside removeFMWHome()
    Clone Stage            clone_status_from_db = 6 (REMOVE-1012-ORACLE-HOME)
    Root Cause             OHS started as root in a previous cycle; stale root-owned OHS log/audit files blocked applmgr deletion
    Production Impact      None: fs1 is the Patch FS, production ran on fs2 throughout
    Fix Applied            mv FMW_Home to dated backup as applmgr (atomic rename, no root needed, seconds to complete; rm -rf was NOT used)
    1st Run Duration       ~1h 34m before fatal error (Apr 11 17:07 → 18:41)
    2nd Run Duration       5h 21m 25s, completed successfully (Apr 11 19:45 → Apr 12 00:49)
    Total Session Elapsed  7h 46m 33s (including failed run, diagnosis, fix, and retry)
    Final Status           adop exiting with status = 0 (Success); all 4 app nodes completed ✅
    Prevention             Never start OHS as root; add post-patching ownership check to runbook
    Oracle Support         Search "ADOP fs_clone Failed to delete FMW_Home" on My Oracle Support

    Happy Debugging! All server-specific details have been anonymised. The diagnostic commands and fix are generic and applicable to any Oracle EBS 12.2.x environment. If this helped you, feel free to share with the community.