Wednesday, May 29, 2013

Weblogic 11g Start Stop Script on Linux

How do you start the WebLogic service automatically on reboot?
How do you restart a WebLogic instance in the background?

These are questions that keep coming up for WebLogic on Unix. I wrote a script that starts the WebLogic instance automatically when the server reboots. It stops and starts WebLogic and the Server Manager Agent in sequence. You can use this script and change the parameters to suit your environment.

The script below is for a Linux system; it is placed in /etc/init.d to start WebLogic 11g.


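#!/bin/sh
#
# chkconfig: 345 99 10
# description: Starts and stops WebLogic 11g and the Server Manager Agent.
#
# Save this file as /etc/init.d/weblogic
# Permissions: chmod 755
# Owner: root:root
# Install using: chkconfig --add weblogic
# (chkconfig needs the "chkconfig:" and "description:" tags above;
#  the run levels and priorities shown are example values, adjust as needed.)
#
######################################
### The following parameter is set in /etc/profile
# MW_HOME=/u01/app/wls1035/Middleware/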

case $1 in
    start)
    su - oracle -c 'nohup /u01/jde_home/SCFHA/bin/startAgent > /dev/null &'
    su - oracle -c 'nohup $MW_HOME/wlserver_10.3/server/bin/startNodeManager.sh > $MW_HOME/wlserver_10.3/common/nodemanager/nodemanager.out &'
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/startWebLogic.sh > $MW_HOME/user_projects/domains/E1_Apps/servers/AdminServer/logs/AdminServer.out &'
    until nc -z jde91web 8000
      do
       sleep 10
      done

    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/startManagedWebLogic.sh NE1SVR t3://jde91web:8000 > $MW_HOME/user_projects/domains/E1_Apps/servers/NE1SVR/logs/NE1SVR.out &'
        until nc -z jde91web 8016
                do
                sleep 10
        done
        ;;

    stop)
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/stopManagedWebLogic.sh NE1SVR t3://jde91web:8000 > $MW_HOME/user_projects/domains/E1_Apps/servers/NE1SVR/logs/NE1SVR.out &'
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/stopWebLogic.sh > $MW_HOME/user_projects/domains/E1_Apps/servers/AdminServer/logs/AdminServer.out &'
    ps -ef | grep NodeManager | grep -v grep | awk '{system("kill -9 "$2)}'
    su - oracle -c 'nohup /u01/jde_home/SCFHA/bin/stopAgent > /dev/null &'
    ;;

    restart)
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/stopManagedWebLogic.sh NE1SVR t3://jde91web:8000 > $MW_HOME/user_projects/domains/E1_Apps/servers/NE1SVR/logs/NE1SVR.out &'
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/stopWebLogic.sh > $MW_HOME/user_projects/domains/E1_Apps/servers/AdminServer/logs/AdminServer.out &'
    ps -ef | grep NodeManager | grep -v grep | awk '{system("kill -9 "$2)}'
    su - oracle -c 'nohup /u01/jde_home/SCFHA/bin/stopAgent > /dev/null &'
    su - oracle -c 'nohup /u01/jde_home/SCFHA/bin/startAgent > /dev/null &'
    su - oracle -c 'nohup $MW_HOME/wlserver_10.3/server/bin/startNodeManager.sh > $MW_HOME/wlserver_10.3/common/nodemanager/nodemanager.out &'
    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/startWebLogic.sh > $MW_HOME/user_projects/domains/E1_Apps/servers/AdminServer/logs/AdminServer.out &'
    until nc -z jde91web 8000
      do
       sleep 10
      done

    su - oracle -c 'nohup $MW_HOME/user_projects/domains/E1_Apps/bin/startManagedWebLogic.sh NE1SVR t3://jde91web:8000 > $MW_HOME/user_projects/domains/E1_Apps/servers/NE1SVR/logs/NE1SVR.out &'
        until nc -z jde91web 8016
                do
                sleep 10
        done
        ;;
esac
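
The `until nc -z ...` loops in the script wait forever if the Admin Server or the managed server never opens its port, which would stall the boot sequence. Below is a minimal sketch of a bounded wait. The function names `probe_port` and `wait_for_port`, and the default limit of 600 seconds, are illustrative assumptions and not part of the original script.

```shell
#!/bin/bash

# probe_port HOST PORT
# Succeeds when something is listening on HOST:PORT (same nc check as the script).
probe_port() {
    nc -z "$1" "$2" >/dev/null 2>&1
}

# wait_for_port HOST PORT [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Polls until the port answers. Returns 1 once TIMEOUT_SECONDS have been
# spent sleeping without a successful probe.
wait_for_port() {
    local host=$1 port=$2
    local timeout=${3:-600}    # assumed default, adjust for your environment
    local interval=${4:-10}    # same 10 second pause as the original loops
    local waited=0

    until probe_port "$host" "$port"; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "Timed out after ${waited}s waiting for ${host}:${port}" >&2
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}
```

In the start branch, a call such as `wait_for_port jde91web 8000 || exit 1` could then stand in for the first `until` loop, so the managed server is not started when the Admin Server never came up.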

Saturday, May 25, 2013

Zombie and Call Object Kernel


       How to identify a Call Object Kernel zombie

   1. From Server Manager, open the Call Object Kernel.
   2. Note how much CPU and memory it is using.
   3. On releases above 8.98.4 you will also see the open JDB connections and the cache count. Note them.
   4. Go to the bottom and note the working threads. Ignore the WRK:IDLE threads; they are not the culprit.
   5. Open jde.log and look for the cause of the zombie. Note the last log entries written when the kernel went into zombie state.
   6. In Server Manager, open all the JDENET_N logs; one of them will contain details about this zombie.
   7. If a stack report was generated, note that as well.
   8. Note the related entries in the JAS log.

  Here is an example.
   This is the process detail of a Call Object Kernel that went into zombie state.

    When it went into zombie state:
   1. CPU utilization was a constant 20%
   2. Memory utilization was between 500 and 600 MB
   3. Open JDB connections: around 30
   4. JDE cache count: between 300 and 400
   5. Number of threads: 13
   6. Number of idle threads: 9 (see the figure above)
   7. The zombie was caused by one of these threads:
     3944 - SYS:Response Msg Listener
     3396 - Sys Dispatch (Note: this is the starting thread when the kernel starts)
     4724 - SYS:IPC SRV
     3932 - WRK:SUPTON_055BD830_P4210

The first three are system threads, and each serves its own process. So the culprit is thread 3932.

 JAS.Log



Kernel PID  -  1992
Dispatch thread - 3396
Type of Kernel - Call Object kernel
Multi thread
Thread Pool Size- 30
Thread Pool Increment - 10

Logs:
Exception occurred in thread 3932.
Thread 3932 is WRK:SUPTON_055BD830_P4210.

This thread is a likely suspect, but not necessarily the cause: another thread may have left it in this state. Do not conclude yet; treat it as a suspect.

User: SUPTON
Application: P4210
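
The user and application above are read straight off the work-thread name. As a small sketch, assuming thread names always follow the `WRK:<user>_<middle token>_<application>` shape seen in this example (other releases may differ, and a user ID containing an underscore is kept intact only because the split works from the right), the name can be decoded like this. The function name `decode_wrk_thread` is hypothetical.

```shell
#!/bin/bash

# decode_wrk_thread NAME
# Splits a work-thread name such as WRK:SUPTON_055BD830_P4210 into
# user and application, assuming the shape WRK:<user>_<token>_<application>.
decode_wrk_thread() {
    local name=${1#WRK:}      # drop the WRK: prefix
    local app=${name##*_}     # text after the last underscore
    local rest=${name%_*}     # everything before the last underscore
    local user=${rest%_*}     # drop the middle token as well
    printf 'User: %s\nApplication: %s\n' "$user" "$app"
}

decode_wrk_thread 'WRK:SUPTON_055BD830_P4210'
# prints:
# User: SUPTON
# Application: P4210
```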

Now go into more detail: open all the JDENET_N logs; one of them will contain the details of the zombie kernel.

 
 



Call Stack
CSALES.dll/UNKNOWN/F4211FSEditLine
CSALES.dll/UNKNOWN/CalculateSOMBFPriceCost
CDIST.dll/UNKNOWN/CalculateSalesPricesAndCosts
CDIST.dll/UNKNOWN/F4072CalculatePriceAdjustments
CCUSTOM.dll/UNKNOWN/GetPrice
CMFG.dll/UNKNOWN/ConfigurationIDCacheFetch

In JDENET_N the call stack is read from top to bottom.
So the source of the problem is
CMFG.dll ConfigurationIDCacheFetch

Client machine: USNEWSPDWL3
Look in the JAS log on that machine for entries like these:

23 May 2013 16:11:15,627[WARN][RUNTIME]*ERROR* CallObject@15d1951c: COSE#1017 Associated kernel not found. Please see Enterprise Server log for details: host E1PRODENT3:6015(49239) SocID:54906 PID:1992 BSFN:F4211FSEditLine user:SUPTON Env:JPD900
 

23 May 2013 16:11:15,642[WARN][RUNTIME]*ERROR* CallObject@15d1951c: Server problem. The server may still be available, but because of state information, the entire unit-of-work must be resubmitted user:SUPTON Env:JPD900 

From these logs:
Server name : E1PRODENT3
PID: 1992
BSFN:F4211FSEditLine
User:SUPTON
Env:JPD900
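
The same fields can be pulled out of a JAS log mechanically. This is a minimal sketch that assumes the COSE#1017 line has the layout of the sample shown here (space-separated `host`, `PID:`, `BSFN:`, `user:` and `Env:` tokens); the function name `extract_cose_fields` is hypothetical, and the log file name passed to it is whatever your JAS instance writes.

```shell
#!/bin/bash

# extract_cose_fields
# Reads JAS log lines on stdin and, for every COSE#1017 line,
# prints the server name, kernel PID, business function, user and environment.
extract_cose_fields() {
    awk '/COSE#1017/ {
        for (i = 1; i <= NF; i++) {
            if ($i == "host") {
                split($(i + 1), h, ":")      # E1PRODENT3:6015(49239) -> E1PRODENT3
                print "Server name: " h[1]
            }
            else if ($i ~ /^PID:/)  print "PID: "  substr($i, 5)
            else if ($i ~ /^BSFN:/) print "BSFN: " substr($i, 6)
            else if ($i ~ /^user:/) print "User: " substr($i, 6)
            else if ($i ~ /^Env:/)  print "Env: "  substr($i, 5)
        }
    }'
}

# Usage (file name is an example): extract_cose_fields < jas.log
```

The PID it reports is the value to match against the kernel PID in Server Manager and in the dump file name.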

Combining JAS.Log and JDENET.log confirms this:
F4211FSEditLine was the main business function, and the zombie occurred because of CMFG.dll - ConfigurationIDCacheFetch.

Note: In KRM we noted that the cache count when the zombie occurred was between 300 and 400.


Now we go to the call stack dump for the thread: jde_1992_1369339853_1_dmp.log

 3932 -  WRK:SUPTON_055BD830_P4210


********************************************************
Thu May 23 16:10:53 2013
********************************************************
Generating call stacks for PID 1992 from PID 1992
********************************************************

====> Exception C0000005 ACCESS_VIOLATION occurred in thread 3932 with call stack:
0x7742e41b.! ntdll.dll 
RtlFreeHeap! ntdll.dll 
HeapFree! kernel32.dll 
free! MSVCR100.dll 
_jdeFreeInternal@4! jdel.dll 
_RBTREE_IndexDestroy@4! jdekrnl.dll 
_RBTREE_Destroy@8! jdekrnl.dll 
_JCACH_CacheDestroyInt@4! jdekrnl.dll 
_jdeCacheTerminate@8! jdekrnl.dll 
_ConfigurationIDCacheFetch@12! CMFG.dll 
_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_GetPrice@12! CCUSTOM.dll 
_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_I4500050_CallExtenalProgram@16! CDIST.dll 
_I4500050_CalculateAdjustment@16! CDIST.dll 
_F4072CalculatePriceAdjustments@12! CDIST.dll 

_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_I4201500_CallAdvPricingAdjs@20! CDIST.dll 
_I4201500_CalculatePrices@16! CDIST.dll 
_CalculateSalesPricesAndCosts@12! CDIST.dll 

_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_I4205120_CallCalcPricesAndCosts@76! CSALES.dll 
_I4205120_CalcPricesandCostsLineNotCancelled@80! CSALES.dll 
_CalculateSOMBFPriceCost@12! CSALES.dll 

_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_I4200310_CalcPricesAndCosts@16! CSALES.dll 
_I4200310_F4211FSEditLine@12! CSALES.dll 

_jdeCallObjectV2@44! jdekrnl.dll 
_jdeCallObject@40! jdekrnl.dll 
_JDEK_ProcessCallRequest@24! jdekrnl.dll 
_JDEK_StartCallRequest@16! jdekrnl.dll 
_runBusinessFunction@4! jdekrnl.dll 
_runCallObjectJob@4! jdekrnl.dll 
_psthread_pool_job_execute@4! PSThreadUtils.dll 
?psthread_pool_worker_function@@YGPAXPAX@Z! PSThreadUtils.dll 
?threadFunctionWrapper@@YGPAXPAX@Z! psthread.dll 
0x761733aa.! kernel32.dll 
0x77439ef2.! ntdll.dll 
0x77439ec5.! ntdll.dll 

=====Call stack of thread 3932=====
_LogNTCallStackDump@12! jdel.dll 
?NTUnhandledExceptionHandler@NTExceptionHandler@@CGJPAU_EXCEPTION_POINTERS@@@Z! jdenet_k.exe 
0x761b003f.! kernel32.dll 
0x774774df.! ntdll.dll 
0x77439ec5.! ntdll.dll  


Within a thread, the dump call stack is always read from bottom to top.
Here is the analysis:

B4200310 - F4211FSEditLine (CSALES.dll) started the call chain that crashed,
and the crash occurred under _ConfigurationIDCacheFetch@12! CMFG.dll, while it was terminating a JDE cache.

Many other business functions are called in between. Note all of them.
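
To list those intermediate business functions without reading the dump by hand, something like the following sketch may help. It assumes the dump layout shown above (one `function! module` frame per line, with the faulting stack between the `====> Exception` line and the `=====Call stack` line) and it treats the modules in the exclusion list as runtime plumbing; that list and the function name `bsfn_call_order` are assumptions for this example.

```shell
#!/bin/bash

# bsfn_call_order
# Reads a *_dmp.log on stdin and prints the business-function frames of the
# faulting stack in call order (bottom of the dump first), as "module function".
bsfn_call_order() {
    awk '
        { sub(/\r$/, "") }                               # tolerate Windows line endings
        /^====> Exception/ { in_stack = 1; next }        # faulting stack starts here
        /^=====Call stack/ { in_stack = 0 }              # exception-handler stack: ignore
        in_stack && NF == 2 && $1 ~ /!$/ {
            module = $2
            # Skip OS, C runtime and JDE kernel frames (assumed list).
            if (module ~ /^(ntdll|kernel32|MSVCR100|jdel|jdekrnl|PSThreadUtils|psthread)\.dll$/)
                next
            fn = $1
            sub(/!$/, "", fn)         # trailing "!" separator
            sub(/^_/, "", fn)         # leading underscore of the stdcall decoration
            sub(/@[0-9]+$/, "", fn)   # trailing @<bytes> of the stdcall decoration
            frames[++count] = module " " fn
        }
        END { for (i = count; i >= 1; i--) print frames[i] }
    '
}

# Usage: bsfn_call_order < jde_1992_1369339853_1_dmp.log
```

For the dump above, the first line printed would be the F4211FSEditLine frame in CSALES.dll and the last would be ConfigurationIDCacheFetch in CMFG.dll, which matches the bottom-to-top reading.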

Conclusion:
1. Keep a record of all the information noted above.
2. It is possible that the Call Object zombie was caused by another business function that was no longer running at the point of the crash.
3. Memory use and the JDE cache count were high when the crash occurred.
4. Look for SARs containing fixes for the above pattern.

Result: F4211FSEditLine initiated the crash, which occurred in ConfigurationIDCacheFetch.