Tag: ebs

  • Fixing Oracle EBS 12.2 Forms Not Opening (Step-by-Step Guide)

    Oracle E-Business Suite (EBS) 12.2 uses Java Web Start (JNLP) to launch forms, unlike older versions like 11i which relied on browser plugins. If your forms are not opening, this guide will help you fix the issue completely.

    Common Symptoms

    • Forms not opening after click
    • .jnlp file downloads but does nothing
    • Application Blocked by Java Security error
    • Nothing happens when clicking forms

    Root Causes

    • Using unsupported Java version (Java 11, 17, or 24)
    • Missing Java Web Start (OpenWebStart)
    • Java security blocking self-signed certificates
    • Old EBS 11i configurations applied to 12.2 environment

    Step-by-Step Solution

    Step 1: Remove Unsupported Java Versions

    Uninstall Java 11, 17, 24, or any version other than Java 8 from Control Panel → Programs → Uninstall a Program.

    Verify removal:

    java -version

    Expected output:

    'java' is not recognized as an internal or external command

    Also clean up residual folders if they exist:

    C:\Program Files\Java\
    C:\Program Files (x86)\Java\
    C:\Users\<username>\AppData\LocalLow\Sun\Java\

    Step 2: Install Java 8

    Download and install Java 8 JRE (1.8.0_xxx) from Oracle’s website. Verify after installation:

    java -version

    Expected output:

    java version "1.8.0_xxx"

    Step 3: Install OpenWebStart

    Oracle EBS 12.2 requires Java Web Start to launch .jnlp files. Since modern Java versions removed the built-in Web Start, install OpenWebStart as a replacement. Download from: https://openwebstart.com

    Step 4: Configure Java Security

    Go to Control Panel → Java → Security → Edit Site List and add your EBS URLs:

    http://your-ebs-url
    http://your-ebs-url:port
    https://your-ebs-url:port
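The Exception Site List is backed by a plain text file with one URL per line (on Windows it typically lives under `%USERPROFILE%\AppData\LocalLow\Sun\Java\Deployment\security\exception.sites`), so it can also be maintained with a small script. A minimal sketch, assuming illustrative ports and a placeholder file path (validate the real path for your Java install):

```shell
# Sketch: the Java Exception Site List is a plain text file, one URL per line.
# On Windows it typically lives under:
#   %USERPROFILE%\AppData\LocalLow\Sun\Java\Deployment\security\exception.sites
# SITES_FILE defaults to a temp file here; the ports below are hypothetical.
SITES_FILE="${SITES_FILE:-$(mktemp)}"

add_site() {
    # Append a URL only if it is not already listed (idempotent).
    grep -qxF "$1" "$SITES_FILE" 2>/dev/null || echo "$1" >> "$SITES_FILE"
}

add_site "http://your-ebs-url"
add_site "http://your-ebs-url:8000"     # hypothetical port
add_site "https://your-ebs-url:4443"    # hypothetical port
cat "$SITES_FILE"
```

Because `add_site` checks for an existing entry first, the script can be re-run safely without duplicating URLs.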

    Step 5: Clear Java Cache

    Go to Java Control Panel → General → Delete Files, select all options, and click OK.

    Step 6: Associate JNLP Files with OpenWebStart

    • Right-click any .jnlp file
    • Select Open with → Choose another app
    • Select OpenWebStart
    • Check Always use this app

    Fix for “Application Blocked by Java Security”

    If you see the Application Blocked by Java Security error, add your EBS URL to the Exception Site List as described in Step 4. This allows self-signed certificates to run without being blocked.

    Expected Behavior After Configuration

    1. Login to EBS 12.2
    2. Click a Form-based responsibility
    3. .jnlp file launches via OpenWebStart
    4. Oracle Form opens successfully

    Important Notes

    • Do not use Java 7 — that is only required for EBS 11i
    • IE Mode in Edge is not required for EBS 12.2
    • Always use the Java 8 + OpenWebStart combination
    • Ensure popups and file downloads are allowed in your browser

    Summary

    Component        Requirement
    ---------------  -----------------------
    Java             Java 8 (required)
    Launcher         OpenWebStart (required)
    Browser Plugin   Not required
    IE Mode          Not required

    Conclusion

    Most EBS 12.2 Forms issues come down to incorrect Java setup or missing OpenWebStart. If you previously configured your machine for EBS 11i, fully revert those settings before applying the 12.2 configuration — conflicting setups are one of the most common causes of Forms not opening.

    — Syed Anwar Ahmed | Oracle Apps DBA | LinkedIn

  • Rethinking UNDO Tablespace Monitoring in Multitenant Environments: A Case Study on APPS_UNDOTS1

    Introduction

    A production alert indicating UNDO tablespace usage at ~99% is typically treated as a critical issue. However, in Oracle Multitenant environments with local UNDO enabled, this metric can be misleading when evaluated using traditional tablespace monitoring techniques.

    This article presents a real-world scenario involving the PDB-level UNDO tablespace APPS_UNDOTS1, where utilization reached ~99%, yet no actual resource contention existed.


    The Incident

    An alert was triggered showing:

    Tablespace              TOTAL_MB     USED_MB     FREE_MB PCT_USED
    -------------------- ----------- ----------- ----------- --------
    APPS_UNDOTS1           28,863.25   28,681.56      181.69    99.37

    At first glance, this indicated:

    • UNDO nearly full
    • Minimal free space
    • Risk of ORA-30036 / ORA-01555

    Environment Context

    This issue occurred in a Multitenant environment with Local UNDO enabled:

    • UNDO tablespace: APPS_UNDOTS1
    • Scope: PDB-level
    • Each PDB maintains its own UNDO

    UNDO behavior must always be analyzed at the PDB level, not just at the CDB level.


    Investigation Approach

    Rather than relying on % used, a state-based analysis was performed.


    Step 1: Active Transactions

    SELECT COUNT(*) FROM v$transaction;

    Observation:

    • Minimal active transactions
    • No ongoing workload pressure

    Step 2: UNDO Extent Analysis

    SELECT status, SUM(bytes)/1024/1024 MB
    FROM dba_undo_extents
    GROUP BY status;

    Key Observation

    STATUS        MB         PCT
    ----------- -------- -------
    ACTIVE         ~3 MB   ~0.01%
    UNEXPIRED     ~51 MB   ~0.18%
    EXPIRED       ~28 GB  ~99.81%

    💡 Critical Insight

    ~99.81% of UNDO (APPS_UNDOTS1) was EXPIRED and fully reusable

    This means:

    • No real space pressure
    • No transaction risk
    • Tablespace is effectively available

    Understanding UNDO Behavior

    UNDO extents follow a lifecycle:

    ACTIVE → UNEXPIRED → EXPIRED → REUSED

    State        Description                    Reusable
    -----------  -----------------------------  ----------------------
    ACTIVE       Used by active transactions    No
    UNEXPIRED    Retained for read consistency  ⚠️ Under space pressure
    EXPIRED      No longer needed               Yes

    UNDO extents are not physically freed but logically reused by Oracle based on demand.


    Why the Alert Was Misleading

    The alert was based on:

    SELECT * FROM dba_free_space;

    Limitation

    • Shows only physically free space
    • Does not reflect reusable UNDO
    • Ignores Oracle’s internal reuse mechanism

    Root Cause

    A long-running concurrent request:

    • Ran for ~30 hours
    • Performed heavy DELETE operations
    • Generated significant UNDO

    During execution:

    • UNDO reached ~99% → real pressure

    After completion:

    • UNDO became EXPIRED
    • Space became reusable
    • Alert persisted → false positive

    When UNDO Is Actually a Problem

    ACTIVE      > 20–30%
    UNEXPIRED   very high
    EXPIRED     very low

    UNDO becomes a real issue only when both EXPIRED and UNEXPIRED extents are exhausted and no reusable space remains.


    Recommended Monitoring Approach

    WITH undo AS (
      SELECT status, SUM(bytes)/1024/1024 mb
      FROM dba_undo_extents
      GROUP BY status
    ),
    total AS (
      SELECT SUM(mb) AS total_mb FROM undo
    )
    SELECT u.status,
           ROUND(u.mb, 2) AS mb,
           ROUND((u.mb / t.total_mb) * 100, 2) AS pct
    FROM undo u, total t;

    Improved Alert Logic

    • ACTIVE > 20% → Critical
    • UNEXPIRED > 80% → Warning
    • Otherwise → Normal
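The alert rules above can be sketched as a small shell function that takes the ACTIVE and UNEXPIRED percentages (as reported by the monitoring query) and prints the alert level. The function name and thresholds mirror the rules above; `awk` handles the decimal comparisons:

```shell
# Sketch of the improved alert logic: classify_undo takes the ACTIVE and
# UNEXPIRED percentages from the monitoring query and prints the level.
# awk is used because plain [ ] cannot compare decimals.
classify_undo() {
    active_pct=$1
    unexpired_pct=$2
    if awk -v a="$active_pct" 'BEGIN { exit !(a > 20) }'; then
        echo "CRITICAL"
    elif awk -v u="$unexpired_pct" 'BEGIN { exit !(u > 80) }'; then
        echo "WARNING"
    else
        echo "NORMAL"
    fi
}

classify_undo 0.01 0.18    # the case-study values: prints NORMAL
classify_undo 25 10        # heavy active undo: prints CRITICAL
```

Fed the case-study numbers (~0.01% ACTIVE, ~0.18% UNEXPIRED), this correctly reports NORMAL instead of the false-positive critical alert.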

    Key Takeaways

    • UNDO is state-driven, not space-driven
    • EXPIRED undo = reusable capacity
    • % used is misleading
    • Always analyze at PDB level in multitenant setups
    • Monitoring must align with Oracle internals

    Conclusion

    UNDO tablespace monitoring must evolve from:

    ❌ Static space-based metrics
    ➡️
    ✅ Dynamic lifecycle-based analysis

    A tablespace showing 99% utilization can still be completely healthy if most of its extents are reusable.


    Key Insight

    “UNDO is not a storage problem — it is a lifecycle problem. The difference between false alarms and accurate diagnosis lies in understanding how Oracle transitions undo extents.”


    Final Thought

    👉 Don’t ask:
    “How full is UNDO?”

    👉 Ask:
    “How much of it is actually in use?”


    Author

    Syed Anwar Ahmed
    Oracle Apps DBA | Oracle EBS | Performance & Troubleshooting

  • When ADOP Remembers Too Much: Fixing Patch Failures Caused by Stale Metadata in Oracle EBS

    During an Oracle E-Business Suite ADOP patching cycle in a multi-node environment, the apply phase failed on one node while completing successfully on others. Despite retries — including downtime mode — the issue persisted, pointing to a deeper inconsistency within the patching framework.


    Symptoms Observed

    • ADOP session status: FAILED
    • Patch applied successfully on some nodes, failed on admin node
    • Repeated failures even with restart=no, abandon=yes, and downtime mode
    • No immediate actionable error from standard logs

    Timeline of Events

    T0 -- Patch execution initiated (ADOP apply phase)
    T1 -- Failure observed on admin node
    T2 -- Retry using downtime mode -- Failure persists
    T3 -- ADOP session review shows inconsistent state
    T4 -- Internal metadata tables analyzed
    T5 -- Cleanup performed (tables + restart directory)
    T6 -- Patch re-executed -- Success across all nodes

    Investigation

    Step 1: Check ADOP Session State

    Query the ADOP session status to understand the current state across all nodes:

    -- Check current ADOP session status
    SELECT session_id, node_name, phase, status,
           start_date, end_date
    FROM applsys.ad_adop_sessions
    ORDER BY start_date DESC;
    
    -- Check apply phase status per node
    SELECT s.session_id, n.node_name, p.phase_code,
           p.status, p.start_date, p.end_date
    FROM applsys.ad_adop_sessions s,
         applsys.ad_adop_session_phases p,
         applsys.fnd_nodes n
    WHERE s.session_id = p.session_id
    AND p.node_id = n.node_id
    ORDER BY p.start_date DESC;

    The existing session showed status FAILED with the apply phase partially completed — a clear indicator of inconsistent execution state across nodes.

    Step 2: Check adalldefaults.txt

    Reviewed the defaults file for any relevant configuration:

    grep -i missing $APPL_TOP/admin/$TWO_TASK/adalldefaults.txt
    # Key parameter found:
    # MISSING_TRANSLATED_VERSION = No

    Modifying and retrying with this parameter had no impact, confirming the issue was not translation-related.

    Step 3: Check Install Processes Table

    -- Check for stale entries in FND_INSTALL_PROCESSES
    SELECT COUNT(*) FROM applsys.fnd_install_processes;
    
    -- View stale entries in detail
    SELECT process_status, process_name, last_update_date
    FROM applsys.fnd_install_processes
    ORDER BY last_update_date DESC;
    
    -- Check AD_DEFERRED_JOBS
    SELECT COUNT(*) FROM applsys.ad_deferred_jobs;
    SELECT * FROM applsys.ad_deferred_jobs;

    Observation: FND_INSTALL_PROCESSES contained stale entries from the failed session. AD_DEFERRED_JOBS was empty.


    Root Cause

    The failure was caused by stale and inconsistent ADOP metadata tables — specifically APPLSYS.FND_INSTALL_PROCESSES and APPLSYS.AD_DEFERRED_JOBS. ADOP internally relies on these tables to track patch progress checkpoints, deferred job execution, and restart state management. When these tables retain entries from failed or incomplete sessions, ADOP assumes an incorrect execution state, leading to patch reconciliation failure, apply phase breakdown, and node-level inconsistencies.


    Resolution Steps

    Step 1: Backup Critical Tables

    -- Always backup before any cleanup
    CREATE TABLE applsys.fnd_install_processes_bak AS
    SELECT * FROM applsys.fnd_install_processes;
    
    CREATE TABLE applsys.ad_deferred_jobs_bak AS
    SELECT * FROM applsys.ad_deferred_jobs;
    
    -- Verify backups
    SELECT COUNT(*) FROM applsys.fnd_install_processes_bak;
    SELECT COUNT(*) FROM applsys.ad_deferred_jobs_bak;

    Step 2: Drop Stale Metadata Tables

    Dropping these tables forces ADOP to rebuild clean metadata during the next run:

    DROP TABLE applsys.fnd_install_processes;
    DROP TABLE applsys.ad_deferred_jobs;

    Step 3: Reset the Restart Directory

    The restart directory can silently preserve failure states. Back it up and create a fresh one:

    cd $APPL_TOP/admin/$TWO_TASK
    
    # Backup existing restart directory
    mv restart restart_bkp_$(date +%Y%m%d)
    
    # Create fresh restart directory
    mkdir restart
    
    # Verify
    ls -la | grep restart

    Step 4: Re-run the Patch

    adop phase=apply \
         patches=<patch_id> \
         restart=no \
         abandon=yes \
         apply_mode=downtime

    The patch completed successfully across all nodes after the metadata cleanup.
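The resolution steps can be sketched as one guarded script. This is a dry-run sketch: every command is echoed rather than executed unless `DRYRUN=0`, the SQL*Plus credentials and the `backup_and_drop_adop_metadata.sql` script name are placeholders, and `<patch_id>` is left as-is:

```shell
# Dry-run orchestration sketch of Steps 1-4. All commands are echoed unless
# DRYRUN=0; credentials, the .sql script name, and <patch_id> are placeholders.
DRYRUN=${DRYRUN:-1}

run() {
    if [ "$DRYRUN" -eq 1 ]; then
        echo "DRYRUN: $*"
    else
        "$@"
    fi
}

# Steps 1-2: backup, then drop, the stale metadata tables (placeholder script)
run sqlplus apps/apps_password @backup_and_drop_adop_metadata.sql

# Step 3: reset the restart directory
run mv "$APPL_TOP/admin/$TWO_TASK/restart" \
       "$APPL_TOP/admin/$TWO_TASK/restart_bkp_$(date +%Y%m%d)"
run mkdir "$APPL_TOP/admin/$TWO_TASK/restart"

# Step 4: re-run the patch
run adop phase=apply patches='<patch_id>' restart=no abandon=yes apply_mode=downtime
```

Keeping the destructive actions behind a dry-run gate makes it easy to review exactly what will be executed before committing to the cleanup in production.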


    Before vs After

    Component          Before Fix    After Fix
    -----------------  ------------  -----------
    ADOP Session       Failed        Successful
    Node Consistency   Partial       Full
    Restart Behavior   Stuck         Clean
    Patch Execution    Incomplete    Completed

    Key Takeaways

    • ADOP is state-driven — even when logs appear clean, internal metadata drives execution decisions
    • Partial success is a clue — if some nodes succeed and one fails, focus on local metadata, not the patch itself
    • The restart directory matters — it can silently preserve failure states and must be validated before retrying
    • Downtime mode is not a fix-all — even in downtime, ADOP still reads metadata tables; corruption persists unless cleaned
    • Always backup before cleanup — never drop tables without creating a backup first

    When NOT to Use This Approach

    Avoid applying this fix if the issue is caused by missing database patches (ETCC warnings), file system or permission issues, incorrect patch sequencing, or environment misconfiguration. Always validate the root cause before performing any metadata cleanup.


    This scenario highlights a subtle but critical behavior in ADOP — sometimes patch failures are not caused by the patch itself, but by what the system remembers about past attempts. By resetting stale metadata, we allow ADOP to re-evaluate the environment cleanly, leading to successful execution.

    Have questions or faced a similar issue? Reach out at sdanwarahmed@gmail.com.

  • Oracle EBS Login Issue: Real-Time Production Incident and Fix (OACORE + WebLogic Filter)


    📌 Introduction

    In Oracle E-Business Suite (EBS) environments, login failures are often perceived as simple application issues. However, in complex architectures, they can originate from multiple interacting layers across the application and middleware stack.

    In this blog, I’ll walk through a real-world production incident where an OACORE JVM issue combined with WebLogic security filtering resulted in complete login inaccessibility.

    This case highlights the importance of analyzing both performance and security layers together when troubleshooting critical application outages.


    ⚠️ Issue Summary

    • Users were unable to access the EBS login page
    • Pages were hanging or not loading
    • WebLogic console reported:
    Connection rejected, filter blocked Socket

    🔍 Initial Observation

    From the application server:

    • Load Average: ~10+

    👉 This indicated:

    • High CPU utilization
    • System under heavy stress
    • Potential JVM thread contention

    🔬 Detailed Analysis

    • One of the OACORE managed server JVMs became unresponsive
    • Long-running threads caused thread pool exhaustion
    • Incoming user requests began queueing

    At the same time:

    • WebLogic connection filter was actively enforcing access rules
    • Legitimate requests were being rejected under stressed conditions

    🧠 Understanding the Components

    OACORE (Application Layer)

    Handles:

    • Login requests
    • Forms processing
    • Core application logic

    If JVM threads are exhausted:
    👉 Requests queue → login hangs


    Oracle WebLogic Server Connection Filter

    EBS environments may use:

    oracle.apps.ad.tools.configuration.wls.filter.EBSConnectionFilterImpl

    This filter:

    • Enforces IP-based access control
    • Overrides default allow rules

    If misconfigured or stressed:
    👉 Legitimate traffic may be blocked
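For context, WebLogic connection filter rules (set via the console or the ConnectionFilterRules attribute) follow a `target localAddress localPort action protocols` pattern and are evaluated top-down, first match wins. A hedged sketch of what a corrected rule set might look like (the subnet is a placeholder; validate against your own network ranges):

```text
10.10.0.0/16  *  *  allow  t3 t3s http https
0.0.0.0/0     *  *  deny
```

Because the first matching rule wins, trusted ranges must appear before the final deny; a deny placed too broadly or too early is exactly how legitimate traffic ends up rejected with "filter blocked Socket".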


    🎯 Root Cause Analysis (RCA)

    The login issue was not caused by a single failure point, but by a combination of application tier resource exhaustion and restrictive middleware-level access control.

    • High CPU utilization and long-running threads caused one OACORE JVM to become unresponsive
    • Thread pool exhaustion led to request queuing, preventing new login requests from being processed

    Simultaneously:

    • The WebLogic connection filter (EBSConnectionFilterImpl) enforced strict access control policies
    • Under high load conditions, legitimate client requests were rejected with “filter blocked Socket”

    This interaction between performance degradation and security enforcement amplified the impact, resulting in complete login inaccessibility despite partial system availability.


    🛠️ Resolution Approach (Controlled & Safe)

    The resolution approach focused on stabilizing the JVM layer while validating and correcting middleware-level access controls in a controlled manner.


    🔹 Step 1: Identify Unresponsive JVM

    ps -ef | grep oacore

    ✔ Identify JVM with abnormal CPU or stuck behavior
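Before concluding a JVM is stuck, it helps to capture thread dumps: `kill -3` (SIGQUIT) makes a HotSpot JVM write a full thread dump to its stdout log (the managed server `.out` file) without stopping the process. A sketch, where `find_oacore_pids` is an illustrative helper and the grep pattern may need adjusting per install:

```shell
# Sketch: request thread dumps from suspect OACORE JVMs before any kill -9.
# kill -3 (SIGQUIT) asks a HotSpot JVM to print a full thread dump to its
# stdout log without terminating it. find_oacore_pids is illustrative.
find_oacore_pids() {
    ps -ef | grep '[o]acore' | awk '{ print $2 }'
}

request_thread_dump() {
    kill -3 "$1" 2>/dev/null && echo "thread dump requested for PID $1"
}

for pid in $(find_oacore_pids); do
    # In practice, take a few dumps ~10s apart to spot threads that never move.
    request_thread_dump "$pid"
done
```

Comparing two or three dumps taken seconds apart shows which threads never progress, which is far stronger evidence of a hang than CPU numbers alone.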


    🔹 Step 2: Handle Stuck JVM (Controlled Action)

    ⚠️ Important Note:

    Forcefully terminating JVM processes should NOT be performed without validation.

    ✔ Recommended Approach:

    • Confirm the process is unresponsive
    • Ensure no critical transactions are running
    • Prefer controlled shutdown where possible

    ✔ Example (Only if fully unresponsive and approved):

    kill -9 <PID>

    👉 Node Manager can restart the JVM automatically after termination


    🔹 Step 3: Rolling Restart of OACORE

    admanagedsrvctl.sh stop oacore_server1
    admanagedsrvctl.sh start oacore_server1

    ✔ Ensures clean JVM state
    ✔ Restores thread pool balance


    🔹 Step 4: Validate WebLogic Connection Filter

    ⚠️ Important Note:

    The WebLogic connection filter is a security control and should NOT be disabled permanently.

    ✔ In this case:

    • Filter behavior was validated as part of troubleshooting
    • Temporary relaxation was used to confirm impact

    ✔ Recommended Approach:

    • Review allowed IP ranges
    • Validate filter configuration
    • Re-enable filter after correction

    The filter was re-enabled after validation and correction of configuration.


    🔹 Step 5: Restart Admin Server (If Required)

    adadminsrvctl.sh stop
    adadminsrvctl.sh start

    ✔ Ensures configuration changes are applied


    ✅ Final Outcome

    • System load reduced from ~10 → ~1.5
    • OACORE JVMs stabilized
    • Login page restored
    • Users successfully accessed the application

    📊 Key Learnings

    1. OACORE JVM issues can manifest as complete application outages due to thread exhaustion and request queuing.
    2. WebLogic connection filters can unintentionally block legitimate traffic, especially under high load conditions.
    3. Multi-layer failures (application + middleware + security) significantly amplify incident impact.
    4. System load metrics (CPU, load average) provide early indicators of JVM stress and should be monitored proactively.

    🧩 Preventive Measures

    • Monitor JVM thread utilization proactively
    • Review connection filter configurations periodically
    • Avoid peak concurrent load spikes
    • Implement alerting for high CPU / load conditions
    • Validate session and request handling patterns

    🔐 Governance Considerations

    • Avoid forceful JVM termination without proper validation
    • Do not disable security controls without understanding impact
    • Follow change management procedures in production
    • Coordinate with application and security teams before changes

    🏁 Conclusion

    This incident demonstrates that login failures in Oracle EBS are not always isolated to a single component but can result from complex interactions across application performance and middleware security layers.

    The combination of JVM resource exhaustion and connection filtering behavior created a compounded failure scenario, leading to complete login disruption.

    A structured, multi-layer troubleshooting approach—focused on performance, configuration, and governance—enabled effective resolution while minimizing risk.

    This reinforces the importance of analyzing both system behavior and security controls together when addressing critical production incidents.


    💡 Pro Tip

    When troubleshooting Oracle EBS login issues, always validate both:

    • JVM health (thread utilization, CPU load)
    • Middleware controls (connection filters, access rules)

    Ignoring either layer can lead to incomplete or misleading diagnosis.

  • Oracle EBS OACORE Server in FAILED_NOT_RESTARTABLE State: Real-Time Issue, RCA and Fix

    In Oracle E-Business Suite (EBS) environments, application tier stability is critical to ensure seamless user experience. However, there are scenarios where managed servers behave unexpectedly and require manual intervention. This post walks through a real-world production issue where an OACORE managed server entered a FAILED_NOT_RESTARTABLE state, its impact, root cause analysis, and how it was resolved.


    Environment Details

    • Oracle E-Business Suite: R12.2.x
    • Application Tier: WebLogic Managed Servers
    • Component Impacted: OACORE Server (oacore_server1)
    • Environment Type: Production

    Problem Statement

    An alert was received indicating oacore_server1 was in FAILED_NOT_RESTARTABLE state. Upon verification, the server was running but Node Manager could not auto-restart it.


    Key Observations

    Despite the OACORE server being in a failed state, the application remained accessible and functional — traffic was being handled by other OACORE servers. This is due to the multi-OACORE architecture with load balancing via OHS/Web tier. However, this creates a hidden risk: load redistribution increases pressure on remaining servers and can lead to cascading failures if not addressed promptly.


    Detailed Analysis

    • Managed server restart attempts failed during initialization
    • Bulk concurrent requests were actively running
    • CPU utilization spiked on the application tier
    • JVM resources were under pressure

    Understanding FAILED_NOT_RESTARTABLE

    In Oracle WebLogic Server, a managed server is marked as FAILED_NOT_RESTARTABLE after repeated unsuccessful restart attempts. This is a protective mechanism designed to prevent unstable restart loops when the server cannot recover successfully.


    Root Cause Analysis

    The OACORE managed server entered FAILED_NOT_RESTARTABLE state due to repeated startup failures following an unclean or resource-constrained shutdown. High CPU utilization and heavy concurrent workload placed JVM resources under pressure, preventing a clean restart cycle. Residual runtime artifacts (such as incomplete shutdown state or resource locks) prevented successful reinitialization, causing WebLogic to mark the server as FAILED_NOT_RESTARTABLE.


    Resolution

    cd $ADMIN_SCRIPTS_HOME
    ./admanagedsrvctl.sh stop oacore_server1
    ./admanagedsrvctl.sh start oacore_server1

    After the controlled restart, the server returned to RUNNING state with all deployments active and the application stable.


    Identify Inactive Forms Sessions

    Inactive sessions holding resources can contribute to JVM pressure. Use this query to identify them safely — do not terminate without proper validation and approvals:

    SELECT s.sid,
           s.serial#,
           s.username,
           s.status,
           s.program,
           s.machine,
           ROUND(s.last_call_et/3600,2) AS hours_inactive
    FROM v$session s
    WHERE s.status = 'INACTIVE'
    AND s.username = 'APPS'
    AND s.program LIKE 'frmweb%'
    AND s.last_call_et > 28800   -- 8 hours
    ORDER BY hours_inactive DESC;

    Reference only — do NOT execute without validation:

    ALTER SYSTEM KILL SESSION 'SID,SERIAL#' IMMEDIATE;

    Automate Session Monitoring

    Use this script to monitor inactive sessions every 8 hours via cron:

    #!/bin/bash
    export ORACLE_SID=your_sid
    export ORACLE_HOME=/path/to/oracle_home
    export PATH=$ORACLE_HOME/bin:$PATH
    
    sqlplus -s / as sysdba <<EOF
    SET LINES 200
    SET PAGES 200
    SELECT COUNT(*) AS inactive_sessions
    FROM v\$session
    WHERE status='INACTIVE'
    AND username='APPS'
    AND program LIKE 'frmweb%'
    AND last_call_et > 28800;
    EXIT;
    EOF
    # Crontab entry - every 8 hours
    0 */8 * * * /path/to/inactive_sessions.sh >> /tmp/inactive_sessions.log
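The monitoring script above only prints a count; a small extension can turn it into a threshold alert. A sketch, where the function name, the default threshold of 50, and the message wording are all illustrative choices:

```shell
# Sketch: classify the inactive-session count from the monitoring script
# against a threshold. The default threshold (50) is an illustrative choice.
check_inactive_sessions() {
    count=$1
    threshold=${2:-50}
    if [ "$count" -gt "$threshold" ]; then
        echo "ALERT: $count inactive frmweb sessions (threshold $threshold)"
    else
        echo "OK: $count inactive frmweb sessions"
    fi
}

check_inactive_sessions 75    # prints the ALERT line
check_inactive_sessions 12    # prints the OK line
```

Piping the count from the SQL*Plus heredoc into this function (and mailing or paging on the ALERT line) turns the cron job from a passive log into an actionable early-warning check.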

    DBA Quick Commands

    # Check system load
    top
    uptime
    ps -ef | grep oacore
    
    -- Check running concurrent requests
    SELECT request_id, phase_code, status_code
    FROM fnd_concurrent_requests
    WHERE phase_code = 'R';

    Key Takeaways

    • Application may appear healthy even when an OACORE server fails due to load balancing
    • FAILED_NOT_RESTARTABLE is a protective mechanism, not the root cause itself
    • Resource pressure and restart failures must be analyzed together
    • Controlled and governed actions are critical in production environments
    • Proactive session monitoring via automation helps prevent recurrence

    Have questions or faced a similar issue? Reach out at sdanwarahmed@gmail.com.