Wednesday, October 18, 2017

EBS 12.2 -- oacore server start problem -- java.util.zip.ZipException: error reading zip file

We encountered a strange problem in an EBS 12.2.6 environment built on Solaris 11 SPARC servers.
The problem started after the DBA restarted the application services.
It was directly related to oacore.
oacore_server1 and oacore_server2 could not be started. (It was a multi-node apps tier environment built on a shared APPL_TOP.)

While all the other managed servers (such as forms) and the Admin Server could be started without any problems, the oacore servers could not.

The oacore managed servers could not be started because of the following error:

java.util.zip.ZipException: error reading zip file
at java.util.zip.ZipFile.read(Native Method)
at java.util.zip.ZipFile.access$1400(ZipFile.java:56)
at java.util.zip.ZipFile$ZipFileInputStream.read(ZipFile.java:679)
at weblogic.utils.io.DataIO.readFully(DataIO.java:351)
at weblogic.utils.io.DataIO.readFully(DataIO.java:328)
at weblogic.utils.classloaders.ZipSource.getBytes(ZipSource.java:76)
at weblogic.utils.classloaders.GenericClassLoader.defineClass(GenericClassLoader.java:330)
at weblogic.utils.classloaders.GenericClassLoader.findLocalClass(GenericClassLoader.java:302)
at weblogic.utils.classloaders.GenericClassLoader.findClass(GenericClassLoader.java:270)
at weblogic.utils.classloaders.ChangeAwareClassLoader.findClass(ChangeAwareClassLoader.java:64)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at weblogic.utils.classloaders.GenericClassLoader.loadClass(GenericClassLoader.java:179)
at weblogic.utils.classloaders.ChangeAwareClassLoader.loadClass(ChangeAwareClassLoader.java:43)
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)

This is an undocumented and interesting real-life case.

I will give the cause and the solution shortly, but first let's look at what we tried in order to find the underlying cause of this problem.

It was obvious that the oacore application was being deployed on WebLogic during the start of the oacore servers.

The problem was in this deployment.

During the deployment, some archive files could not be read. (They could be zip, jar or war files.)

The error stack told us that much, but it didn't give us the name of the problematic file.
  • So, we enabled debug on WLS, including debug for the Deployer.
Enabling debug:
Environment > Servers > MyServer > Debug > weblogic
Then, enable the debug scope you need, e.g. Deployment.
This change does not require a WebLogic Server restart.
Make sure the severity is set to Debug in the WebLogic console:
Environment > Servers > MyServer > Logging > Advanced > Minimum severity to log: Debug

Even after enabling debug, the name of the problematic file could not be determined.
  • We executed ChkEBSDependencies to ensure that there was no dependency failure.
$FND_TOP/bin/txkrun.pl -script=ChkEBSDependencies -server=ALL_SERVER

This was successful, so dependencies were not the cause.
  • We executed truss to see which jar/zip file was having issues.
 truss -daefo /tmp/to_erman.log admanagedsrvctl.sh start oacore_server1

However, truss didn't give us the name. (Or rather, the output file was too big to analyze properly.)
  • Modified the ulimits (especially the hard and soft limits for open files and processes). Again, not fixed.
  • Tried to create a new oacore server using "$AD_TOP/patch/115/bin/adProvisionEBS.pl ebs-create-managedserver", in case the problem was tied to a specific oacore server.
However, this command failed, because it tried to start the new server once it was created, and that new managed server (let's say oacore_server2) failed with the same zip error! So this was neither a solution nor a workaround.
  • Checked the AD and TXK patch levels, but they were already high.
SQL> select ABBREVIATION, NAME, codelevel FROM AD_TRACKABLE_ENTITIES where abbreviation in ('txk','ad');

ABBREVIATION NAME                                  CODELEVEL
ad           Applications DBA                      C.9
txk          Oracle Applications Technology Stack  C.9
  • Did the following things as instructed by Oracle Support (although I found them unrelated to our zip issue):
1. Set the SITE level profile option "FND: Disable Inline Attachments" (FND_DISABLE_INLINE_ATTACHMENTS) to a value of "TRUE"
2. Restart the EBS middle tier services to ensure the profile option change is picked up
3. Monitor for any further recurrence of the issue

1. Set the s_jdbc_connect_descriptor_generation parameter to TRUE on the Target instance
2. Run autoconfig for the affected parameters to reference the Target instance
3. Re-test the issue

As I expected, these moves didn't solve the issue.
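
The truss log from the attempt above was too large to read by hand, but a small filter can narrow it down. The following is a minimal Python sketch, not something we ran at the time. It assumes truss lines shaped like pid/lwp: timestamp open("path", ...) = fd and read(fd, ...) Err#5 EIO; the exact layout depends on the truss flags and the Solaris release, so the regular expressions and the function name find_eio_paths are illustrative assumptions.

```python
import re
from collections import defaultdict

# Assumed truss line shapes (illustrative, not copied from the real log):
#   1234/1:  0.0123 open("/path/a.zip", O_RDONLY)      = 7
#   1234/1:  0.0150 read(7, 0xFFFF7000, 8192)          Err#5 EIO
PID_RE = re.compile(r'^\s*(\d+)(?:/\d+)?:')
OPEN_RE = re.compile(r'\bopen(?:at)?(?:64)?\(.*?"([^"]+)".*\)\s*=\s*(\d+)\s*$')
EIO_FD_RE = re.compile(r'\b(\w+)\((\d+),.*\bErr#5 EIO\b')
EIO_PATH_RE = re.compile(r'\b(\w+)\(.*?"([^"]+)".*\bErr#5 EIO\b')


def find_eio_paths(lines):
    """Return a sorted list of (path, syscall) pairs that failed with EIO."""
    fd_to_path = defaultdict(dict)  # pid -> {fd: path of the last open on that fd}
    hits = set()
    for line in lines:
        pid_match = PID_RE.match(line)
        pid = pid_match.group(1) if pid_match else "?"

        # Remember which path each successful open returned.
        opened = OPEN_RE.search(line)
        if opened:
            fd_to_path[pid][opened.group(2)] = opened.group(1)
            continue

        # A syscall on a file descriptor failed with EIO: map the fd back to a path.
        failed = EIO_FD_RE.search(line)
        if failed:
            fd = failed.group(2)
            path = fd_to_path[pid].get(fd, "<unknown fd %s>" % fd)
            hits.add((path, failed.group(1)))
            continue

        # A syscall that takes a path directly (e.g. open) failed with EIO.
        failed = EIO_PATH_RE.search(line)
        if failed:
            hits.add((failed.group(2), failed.group(1)))
    return sorted(hits)
```

Feeding it the lines of /tmp/to_erman.log (opened with errors="replace") would print the suspect paths. File descriptors inherited across a fork are not tracked, so some hits may be reported as an unknown fd.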

Okay. Let's see how I found the problematic file and how I fixed the issue.

After the attempts above, I decided to regenerate the jar files using adadmin.

I knew that those jar files were used by the oacore servers, but I wasn't expecting that zip files were read during the deployment/start of oacore_server, and I didn't expect that the same zip files would be read when regenerating the jar files with adadmin.

So, I ran adadmin and tried to regenerate the jar files.
adadmin failed with an error, so I checked the adadmin.log file.

There they were: the I/O errors.

ERROR: I/O error while attempting to read /u01/app/fs1/EBSapps/comn/java/lib/DnBGlobalAccess.zip

ERROR: I/O or zip error while attempting to read entry oracle/dss/dataView/AdornmentLayout.class in zip file /u01/app/fs1/EBSapps/comn/java/lib/bipres.zip

ERROR: I/O or zip error while attempting to read entry oracle/apps/edr/security/server/EdrVpdRuleEOImpl.class in zip file /u01/app/fs1/EBSapps/comn/java/classes

So, the zip files and some class files in $JAVA_TOP could not be read due to I/O errors.

After seeing these errors, I went directly to the filesystem and tried to copy the problematic files using the cp command.

I/O errors, again! Solaris could not copy them either.

So, the files were corrupted at the OS/storage layer, that is, at the filesystem level. (I sent this info to the OS team and requested a host and filesystem check from them.)
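
In hindsight, the same symptom (a file that cannot be read end to end) can be looked for directly, without going through adadmin. The following is a minimal Python sketch under that assumption; it is not a tool we used in this case, and the names read_fully and find_unreadable are made up for illustration. It reads every file under a directory byte by byte and, for zip/jar/war files, also asks the standard zipfile module to verify each entry.

```python
import os
import zipfile
import zlib

ARCHIVE_SUFFIXES = (".zip", ".jar", ".war")


def read_fully(path, chunk_size=1024 * 1024):
    """Read every byte of a file; an I/O error surfaces as OSError."""
    with open(path, "rb") as handle:
        while handle.read(chunk_size):
            pass


def find_unreadable(root):
    """Return sorted (path, reason) pairs for files that cannot be read cleanly."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Step 1: plain sequential read, the same thing cp would do.
                read_fully(path)
                # Step 2: for archives, check the CRC of every entry.
                if name.lower().endswith(ARCHIVE_SUFFIXES):
                    with zipfile.ZipFile(path) as archive:
                        bad_entry = archive.testzip()
                    if bad_entry is not None:
                        problems.append((path, "bad entry: " + bad_entry))
            except (OSError, zipfile.BadZipFile, zlib.error, EOFError) as exc:
                problems.append((path, "%s: %s" % (type(exc).__name__, exc)))
    return sorted(problems)
```

Calling find_unreadable on a directory such as /u01/app/fs1/EBSapps/comn/java/lib would return the files to hand over to the OS team. It reads every byte, so expect it to take a while on a large tree.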

What I did for the fix was simple:

I renamed those files and copied them from the patch filesystem. (I checked the patch fs; these files were identical to the ones in the run filesystem.)

The copy was successful, so the files in the patch fs were not corrupted.
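
Before trusting a copy from the patch filesystem, it helps to confirm two things: that it has the same size as the damaged file, and that it can itself be read end to end. The sketch below is one way to do that in Python, offered as an assumption rather than the exact check I performed. The helper name check_replacement is hypothetical, and so is any patch-side path you pass in (for example an fs2 counterpart of the fs1 paths above). The size comes from stat(), which only touches metadata, so it still works on a file whose data blocks are unreadable.

```python
import hashlib
import os


def check_replacement(run_path, patch_path):
    """Compare sizes via stat() and confirm the patch copy reads end to end."""
    run_size = os.stat(run_path).st_size      # metadata only, no data blocks read
    patch_size = os.stat(patch_path).st_size

    digest = hashlib.sha256()
    with open(patch_path, "rb") as handle:    # raises OSError if this copy is damaged too
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)

    return {
        "same_size": run_size == patch_size,
        "patch_sha256": digest.hexdigest(),
    }
```

The returned checksum can be kept and compared again after the copy, to make sure the new file on the run filesystem was written correctly.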

After copying them from the patch fs, I ran adadmin again (generate jar files).
This time, it completed successfully.

After that, I started the services using adstrtal.sh.

This time, oacore_server1 and oacore_server2 started successfully!

So, at the end of the day, I spent almost 6 hours solving this.
No sleep during the diagnostics work!

Unfortunately, the issue was undocumented, and we found no way to identify the problematic zip file other than running adadmin to regenerate the jar files.

Anyways, I hope you find this post useful.

Wednesday, October 11, 2017

Oracle Database Appliance / ODA X7-2 released!

Oracle released ODA X7-2, the new generation of the ODA machine. This new ODA has more CPU cores, more processing power and more disk capacity than its predecessor, ODA X6-2.

What is more interesting than these improved system resources is that ODA X7-2 will support Standard Edition databases even in its HA model!

We will have S, M and HA models in ODA X7-2. So, there is no ODA Large (L) model in the ODA X7-2 family.
I think we will see these machines in several customer environments in the coming days.
I'm already excited about it :)

You can read more at:

http://www.oracle.com/technetwork/database/database-appliance/overview/index.html
https://www.oracle.com/engineered-systems/database-appliance/x7-2m/index.html

ODA X6-2M -- virtualization with KVM -- a real life example and my first thoughts

I recently created a virtualized environment on ODA X6-2M.
I used Kernel-based Virtual Machine (KVM) for virtualizing this new ODA Medium model, as instructed by Oracle.

The machine that I worked on looked like this:

[root@odax6 ~]# odacli describe-component
System Version  
---------------
12.1.2.11.0

Component                                Installed Version    Available Version
---------------------------------------- -------------------- --------------------
OAK                                      12.1.2.11.0          up-to-date
GI                                       12.1.0.2.170418      up-to-date
DB                                       11.2.0.4.170418      up-to-date
ILOM                                     3.2.7.26.a.r112632   3.2.9.23.r116695
BIOS                                     38050100             38070200
OS                                       6.8                  up-to-date


[root@odaX6 ~]# odacli describe-appliance

Appliance Information                                           
---------------------------------------------------------------- 
                     ID: xxxxxxxxxxxxxxxxxxxx
               Platform: OdaliteM
        Data Disk Count: 2
         CPU Core Count: 20

                Created: August 22, 2017 1:19:18 PM EET

The OS of this ODA machine was Oracle Linux 6.8.
I would like to call it the new ODA, but ODA X7-2 has just been released :) It is hard to keep up with this ODA family :)

Anyway, ODA X6-2M is not configured with KVM out of the box.
So, I needed to enable KVM in this environment myself.
I must admit that it was pretty easy to enable KVM on this machine.

I just started libvirtd and installed virt-manager, which is the GUI for managing KVM.

[root@odax6 ~]# service libvirtd start
Starting libvirtd daemon: 
[root@odax6 ~]# service libvirtd status
libvirtd (pid  8943) is running...

[root@odax6 yum.repos.d]# wget http://yum.oracle.com/public-yum-ol6.repo
[root@odax6 yum.repos.d]# yum install virt-manager

That was it, the KVM enablement was done!

After this point, I continued with the storage pool and KVM network configurations.

In order to create the storage pool, I first created an ACFS volume using asmca:

[grid@odax6 asmca]$ asmca -silent -createVolume -volumeName kvm_repo1 -volumeDiskGroup DATA -volumeSizeGB 300 -sysAsmPassword welcome1
[grid@odax6 asmca]$ asmcmd volinfo -G DATA kvm_repo1 | grep -oE '/dev/asm/.*'
/dev/asm/kvm_repo1-33

Then, I created the ACFS filesystem on top of it and mounted it with a single command, again using asmca in silent mode:

[grid@odax6 asmca]$ asmca -silent -createACFS -acfsVolumeDevice /dev/asm/kvm_repo1-33 -acfsMountPoint /kvm_repos/kvm_repo1

ASM Cluster File System created on /dev/asm/kvm_repo1-33 successfully. Run the generated ACFS registration script /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh as privileged user to register the ACFS with Grid Infrastructure and to mount the ACFS. The ACFS registration script needs to be run only on this node: odax6.

-- I needed to run acfs_script.sh as root as a part of this ACFS creation.

[root@odax6 ~]# sh /u01/app/grid/cfgtoollogs/asmca/scripts/acfs_script.sh

ACFS file system /kvm_repos/kvm_repo1 is mounted on nodes odax6

Later on, I checked my mounts and saw that the new ACFS filesystem was there.

[root@odax6 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroupSys-LogVolRoot
                       30G   21G  7.7G  73% /
tmpfs                 126G  631M  126G   1% /dev/shm
/dev/sda1             477M   46M  406M  11% /boot
/dev/mapper/VolGroupSys-LogVolOpt
                       59G   11G   46G  18% /opt
/dev/mapper/VolGroupSys-LogVolU01
                       99G   21G   73G  23% /u01
/dev/asm/dattest-33   100G  1.7G   99G   2% /u02/app/oracle/oradata/test
/dev/asm/reco-481     149G  5.2G  144G   4% /u03/app/oracle
/dev/asm/commonstore-33
                      5.0G   49M  5.0G   1% /opt/oracle/dcs/commonstore
/dev/asm/kvm_repo1-33
                      300G  648M  300G   1% /kvm_repos/kvm_repo1

After the creation of the repo, I started the storage pool and also set it to autostart.

[root@odax6 ~]# virsh pool-start kvm_repo1
Pool kvm_repo1 started
[root@odax6 ~]# virsh pool-autostart kvm_repo1
Pool kvm_repo1 marked as autostarted
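
Note that virsh pool-start only works on a pool that libvirt already knows about, and the definition step is not shown above. One possible way to define it, sketched here as an assumption rather than a record of what was actually run, is a directory-type pool that points at the ACFS mount point. The Python below only builds the pool XML; the function name dir_pool_xml is hypothetical.

```python
import xml.etree.ElementTree as ET


def dir_pool_xml(name, path):
    """Build libvirt storage pool XML for a directory-backed pool."""
    pool = ET.Element("pool", {"type": "dir"})
    ET.SubElement(pool, "name").text = name
    target = ET.SubElement(pool, "target")
    ET.SubElement(target, "path").text = path  # directory that holds the disk images
    return ET.tostring(pool, encoding="unicode")


# Pool name and mount point as used in this post.
pool_xml = dir_pool_xml("kvm_repo1", "/kvm_repos/kvm_repo1")
```

Saving pool_xml to a file (say kvm_repo1.xml, a name chosen for illustration) and running virsh pool-define kvm_repo1.xml as root should register the pool, after which the pool-start and pool-autostart commands above apply.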

Then I checked it to see whether it was there and whether its size and everything else were configured properly.

Note that virsh is a command-line management tool for KVM.

[root@odax6 ~]# virsh pool-info kvm_repo1
Name:           kvm_repo1
UUID:           97dda9d1-ca6b-c9e2-bbfc-901cc2274898
State:          running
Persistent:     yes
Autostart:      yes
Capacity:       300.00 GiB
Allocation:     647.62 MiB
Available:      299.37 GiB

[root@odax6 ~]# virsh vol-list --pool kvm_repo1
Name                 Path                                    
-----------------------------------------
lost+found           /kvm_repos/kvm_repo1/lost+found  


At this point, my storage pool and volumes were configured properly.

I continued with the network stack.

I needed to arrange a network, that is, a virtual interface for the virtual machines that would reside on this ODA environment. (The virtual machines, in my case, were EBS Application Tier nodes.)
I had 2 options. (Actually 3, if we count networking with MacVTap.)

The first option was NAT forwarding. NAT forwarding could not be used in my case, because the ODA X6-2M and the virtual Application Tier nodes (that would reside on ODA X6-2M) were on the same network. Their IPs were from the same block, so I had to use the other option, which was bridged networking ("shared physical device").

This method is full bridging, which lets the guests (EBS Application Tier nodes in my case) connect directly to the LAN.

In order to configure the network, I used virt-manager.
However, virt-manager had some font problems, so I needed to fix them first.
Here is a little info, and the fix for it:


virt-manager is a management interface that eases the administration of a KVM environment (on ODA or anywhere else). It is called Virtual Machine Manager and it is started with the virt-manager command (as root).
As it is a GUI, it needs an X environment to run.
In the Oracle Linux world, as you may agree, we mostly use vncserver for displaying X screens remotely.
So, we connect to the vncserver (or we can use the ILOM remote console or anything that does the same thing) and execute virt-manager to start the Virtual Machine Manager for KVM.
The issue starts here.
After deploying the ODA and enabling KVM, we run the virt-manager command and see garbage characters.
We actually see little squares rather than letters.
So, in order to fix this, we basically need to install the fonts that Virtual Machine Manager needs.
A simple yum command can do this, and this little piece of information may save you time :)
Fix: yum install dejavu-lgc-sans-fonts

Well, after the fix, I could use virt-manager without any problems.

So, in order to configure the VM network, I did the following:

Opened virt-manager.

Connected to the KVM environment.

Created a bridge named br1 on btbond1 and activated it directly (using the network interfaces tab).

-- I used this bridge for multiple machines. (I had 2 apps VMs on the ODA, so their virtual NICs were based on this bridge called br1.)

All of this was done from the GUI (virt-manager), and that was it.

My KVM network was configured.
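
For reference, a guest attached to a host bridge ends up with an interface element of type bridge in its libvirt domain XML. virt-manager writes this for you when the bridge is selected in the new VM wizard; the sketch below just builds the same kind of element in Python to show its shape. The virtio model is an assumption (I did not note which NIC model the guests used), and bridge_interface_xml is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def bridge_interface_xml(bridge, model="virtio"):
    """Build the libvirt <interface> element for a guest NIC on a host bridge."""
    interface = ET.Element("interface", {"type": "bridge"})
    ET.SubElement(interface, "source", {"bridge": bridge})  # host bridge, e.g. br1
    ET.SubElement(interface, "model", {"type": model})      # assumed NIC model
    return ET.tostring(interface, encoding="unicode")


# One such element per guest; both apps VMs would reference the same bridge.
nic_xml = bridge_interface_xml("br1")
```

Running virsh dumpxml against one of the guests is a quick way to check what virt-manager actually generated.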

The last thing to do was to create the virtual machines for my EBS Apps nodes and install the operating systems (in my case, Linux) on them.

The virtual machine creation and OS installation were extremely straightforward.

Again, I used virt-manager.

In order to create a virtual machine and configure it to boot from the OS installation media, I did the following:

Clicked the "Create a new virtual machine" button to open the new VM wizard.

Specified the installation type "Local install media (ISO image)" -- clicked next :)

Located the ISO image and configured the OS type and version (Linux, Red Hat 6 in my case; I had already downloaded the OS installation ISO and placed it on the ODA X6-2M earlier) -- clicked next :)

Configured CPU and memory -- clicked next :)

Configured the VM's local disks and their sizes (on the ACFS volume created previously) -- clicked next :)

Lastly, selected the network device (br1, the bridge in my case) -- this time clicked the Finish button :)

I did these things 2 times, because I needed 2 apps virtual machines on the ODA.

After creating the virtual machines, I started them using virt-manager and they booted from the Oracle Linux 6 installation media. I used the console that comes with virt-manager to install the OS and then started using the Apps nodes directly without any problems (after configuring the network and their IPs, of course).



At the end of the day, I got myself a virtualized ODA X6-2M.
This virtualization is a little different from the Oracle VM Server virtualization that we had in the earlier releases of ODA.

In ODA X6-2M, we use KVM. So, there is no ODA_BASE; we just place our databases directly on the ODA nodes and create our guest machines for the Apps Tier nodes.

In short, the apps nodes run in VMs on top of KVM, and the databases run directly on the ODA nodes, that is, on bare metal ("as instructed by Oracle").

We still have capacity on demand, both for databases and virtual machines.

This was interesting work for me. After all these years of dealing with Oracle VM Server, I configured a new virtualized ODA with a different virtualization technology. Anyway, I liked it and found it to be a good and easy virtualization solution.

We will also see its performance in the next couple of days...

Monday, October 9, 2017

About my tech reviews

I like doing tech reviews and have already done a couple of them on ITCentralStation.com. (I reviewed Exadata and Oracle Linux a few months ago.)

Recently, ITCentralStation.com sent me my Top 5 contributors badge, and it reminded me that it is time to write another review :)

Currently, I'm considering reviewing ODA X6, ODA KVM or Oracle EBS 12.2, but we'll see.
Once it is ready, I will update you with the link...