Wednesday, March 29, 2017

RDBMS -- nostalgia : What happens when we update a table and commit

Recently, one of the juniors on my team asked me the following question:

What happens when I update a table and commit the change after a few seconds?
The question was actually copied from an OCP preparation book, and to answer it we needed to put the following events in the correct sequence:

A. Oracle reads the blocks from data file to buffer cache and updates the blocks.
B. Changed blocks from the buffer cache are written to data files.
C. The user commits the change.
D. LGWR writes the changed blocks to the redo log buffer.
E. The server process writes the change vectors to the redo log buffer.
F. LGWR flushes the redo log buffer to redo log files.
G. A checkpoint occurs.

I liked the question and wanted to answer it with a paragraph describing everything that is supposed to happen, all the way from updating the table in memory to writing the modified block into the datafile.

This was nostalgia for me. It reminded me of the old days, when I was a Core DBA.

Here is what I wrote:

1) The update statement that we issue from our client application (Toad, SQL Developer, SQL*Plus, etc.) is sent to the server process (the shadow process, the one shown as LOCAL=NO, etc.). Client + shadow process, or client task + server task: there are always 2 tasks (as the name implies -> TWO TASK).

2) The server process checks the shared pool to see if our statement is already there, checks the object permissions, and so on (I'm skipping the details). At this stage, the server process parses the statement (a hard parse, if the statement is not present in the shared pool).

3) The server process loads the data that we want to update into the buffer cache, or finds it in the buffer cache if it is already there.

4) The server process (on behalf of our client) updates the data in the buffer cache. (The server process is a reflection of Oracle.)

5) As we commit our update, LGWR writes the relevant redo records from the log buffer (a memory area in the SGA) to the redo log file. This is practically a synchronous write, because we wait for it when we commit. At this point, the data is not yet updated in the datafile, but the change is already written to the redo log file (write-ahead logging). If we could somehow read the block directly from the datafile at this point, we would still see the old value, the one from before the update.

6) DBWR writes the changed block to disk (the datafile) in the background, in its own time, with efficient timing. This is an asynchronous write, and it is asynchronous for performance reasons: if it were synchronous, we would have to wait for the seek times in the foreground. LGWR, on the other hand, needs minimal seek time, as it writes sequentially. It writes in units of the OS block size, and it writes only the redo vectors (not database block I/O, not a block write).

7) When a checkpoint happens, DBWR writes to the datafiles. (Note that DBWR may write to the datafiles without a checkpoint as well, for example when the number of dirty blocks in the buffer cache increases and the space available for free buffers shrinks.)

The important question is this one:
When does DBWR write these dirty (changed) blocks to the datafile?

DBWR is normally a sleepy process. It sleeps until a triggering event happens. One of these triggering events is generated by CKPT. CKPT awakens the DBWR and makes it flush the dirty blocks from the buffer cache.

Coming back to our question: I think there is an error in it.

"LGWR writes the changed blocks to the redo log buffer" --> This is wrong. LGWR writes from the redo log buffer to the redo log files.

Aha, sorry. Ignore the sentence above. There is a note in the original question:
It says: "ignore the actions that are not relevant."

So, we should simply ignore this wrong statement. Well... Then, the question can be answered properly.

On the other hand, the question asks about a very specific thing.

This is a question inside the question.
It wants us to answer this ->
Which one of the following happens first?
The update in memory, or the writing of the change vector to the log buffer (which is done by the server process)?

Why do I find this so specific and detailed?
Because these two events happen at almost the same time (according to several resources), but the update in the buffer cache seems to happen one tick earlier.

The question actually gives us a clue here. The clue is "Oracle reads the blocks from data file to buffer cache and updates the blocks."
So, it puts "read and update" in the same sentence, and as the read is the very first thing to happen, there is no need to think about the question above. The "update" must come before the write into the log buffer. (Note that the shadow process writes to the log buffer.)

The commit should come after both of these events, because the question says: commit the change "after a few seconds", so at least 1 or 2 seconds pass.

That's why the correct answer is ->

"A,E,C,F,G,B"
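
To make the sequence concrete, here is a toy Python model of the flow. It is only a sketch: the class, its attribute names and the sample block are made up for illustration, it records the events in the order argued in this post, and it does not try to mirror Oracle's real internals (for example, it treats the checkpoint as the only trigger for DBWR).

```python
class ToyInstance:
    """Toy model of update / commit / checkpoint; hypothetical, not Oracle internals."""

    def __init__(self, datafile):
        self.datafile = dict(datafile)  # blocks on disk
        self.buffer_cache = {}          # blocks cached in memory
        self.dirty = set()              # changed blocks not yet written to disk
        self.log_buffer = []            # change vectors in memory
        self.redo_log_file = []         # change vectors on disk
        self.events = []                # letters of the events, in the order they happen

    def update(self, block_id, new_value):
        # A: read the block into the buffer cache (if needed) and change it there.
        if block_id not in self.buffer_cache:
            self.buffer_cache[block_id] = self.datafile[block_id]
        self.buffer_cache[block_id] = new_value
        self.dirty.add(block_id)
        self.events.append("A")
        # E: the server process puts the change vector into the log buffer.
        self.log_buffer.append((block_id, new_value))
        self.events.append("E")

    def commit(self):
        # C: the user commits; F: the log buffer is flushed to the redo log file.
        self.events.append("C")
        self.redo_log_file.extend(self.log_buffer)
        self.log_buffer.clear()
        self.events.append("F")

    def checkpoint(self):
        # G: a checkpoint occurs; B: dirty blocks are written to the datafile.
        self.events.append("G")
        for block_id in sorted(self.dirty):
            self.datafile[block_id] = self.buffer_cache[block_id]
        self.dirty.clear()
        self.events.append("B")


db = ToyInstance({"block_7": "old value"})
db.update("block_7", "new value")
db.commit()
after_commit = db.datafile["block_7"]      # datafile still holds the old value
db.checkpoint()
after_checkpoint = db.datafile["block_7"]  # datafile now holds the new value
sequence = ",".join(db.events)
print(after_commit, "|", after_checkpoint, "|", sequence)
```

The point of the model is the state right after the commit: the change is already in the redo log file, while the datafile copy of the block is still unchanged until the checkpoint.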

Feel free to comment ...

Wednesday, March 15, 2017

RDBMS Licensing -- CPU / Core limits for Named User Plus licenses

As you may already know, database licensing (Standard Edition 2 and Enterprise Edition) can be done in two ways:

1) CPU-based licensing. 2) NUP (Named User Plus) based licensing.

I will not go into the details of these licensing methods, because that is neither my job nor my interest.
However, I want to shed some light on a specific topic, which can be a little confusing.

Although the information that you will find below is on a specific licensing topic, it gives general information about database licensing, as well.

Note that the information given below is about Enterprise Edition, as we mostly use Enterprise Edition (e.g. Oracle EBS databases are Enterprise Edition).

The topic that I want to inform you about is the CPU limit for Named User Plus licensing.

That is, although Named User Plus licensing is based on the database user count, there is a limit on the CPU/core count as well.

In other words, you can't just buy 25 Named User Plus licenses and run your database on a server which has 24 CPU cores enabled.
(Note that Enterprise Edition requires a minimum of 25 Named User Plus licenses per processor, or the total number of actual users, whichever is greater.)

Let's take a closer look at this;

The CPU-based licensing for Oracle Database Enterprise Edition is actually done on a per-core basis.
We count the physical cores of our database server, then multiply this total by a core factor (0.5 for Intel CPUs) to calculate the number of CPU/processor licenses needed for our database environment.

The same calculation is used for deriving the maximum core count that we can have for X number of Named User Plus licenses.

Let's take a look at the following example to reinforce what I just explained.

Suppose you have 50 Named User Plus licenses and want to know the maximum CPU core count that you can have with these licenses.

50 Named User Plus licenses can cover up to 2 processor licenses (50 / 25 = 2).
These 2 processor licenses are in Oracle's licensing units, so they should be divided by the processor core factor to derive the maximum number of physical CPU cores that we can have (the core factor for Intel is 0.5).

So, 2 / 0.5 = 4 CPU cores. Thus, we can have 4 cores enabled if we have 50 Named User Plus licenses.

For instance, if we have 50 Named User Plus licenses and an ODA X6-2S, then we should enable only 4 of its cores.

Similarly, if we want to enable all the cores of the ODA X6-2S (10 cores total), then we need to do the following calculation to find the extra licenses that we will need ->

10 - 4 = 6 -> 6 * 0.5 = 3 extra processor licenses, or 3 * 25 = 75 extra Named User Plus licenses.

Note that all fractions are to be rounded up to the next whole number. For instance, if we get 1.5 as the result of these calculations, we need to round it up to 2.
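
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the calculation in this post: the function names are my own, the core factor defaults to the 0.5 used here for Intel, and the "actual user count, whichever is greater" part of the rule is deliberately left out.

```python
import math

CORE_FACTOR_INTEL = 0.5     # core factor used in this post for Intel CPUs
NUP_MIN_PER_PROCESSOR = 25  # Enterprise Edition minimum NUP per processor license


def processor_licenses(physical_cores, core_factor=CORE_FACTOR_INTEL):
    """Processor licenses needed for a core count; fractions are rounded up."""
    return math.ceil(physical_cores * core_factor)


def max_cores_for_nup(nup_licenses, core_factor=CORE_FACTOR_INTEL):
    """Largest number of enabled cores that the given NUP count can cover."""
    processors = nup_licenses // NUP_MIN_PER_PROCESSOR
    return math.floor(processors / core_factor)


def extra_nup_needed(nup_licenses, physical_cores, core_factor=CORE_FACTOR_INTEL):
    """Extra NUP licenses needed to enable the given number of cores."""
    required = processor_licenses(physical_cores, core_factor) * NUP_MIN_PER_PROCESSOR
    return max(0, required - nup_licenses)


print(max_cores_for_nup(50))     # cores we can enable with 50 NUP
print(extra_nup_needed(50, 10))  # extra NUP needed for all 10 cores of the ODA X6-2S
print(processor_licenses(3))     # 3 cores * 0.5 = 1.5, rounded up
```

With the numbers from the example, this gives 4 cores for 50 Named User Plus licenses, 75 extra licenses for 10 cores, and 2 processor licenses for a result of 1.5.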

References:

http://www.oracle.com/us/corporate/pricing/databaselicensing-070584.pdf

Database Licensing - Oracle

Product Minimums for Named User Plus licenses (where the minimums are per processor) are calculated after the number of processors to be licensed is determined, using the "PROCESSOR DEFINITION".

PROCESSOR DEFINITION:

The number of required licenses shall be determined by multiplying the total number of cores of the processor by a core processor licensing factor specified on the Oracle Processor Core Factor Table.


I wrote this article because DBAs and Apps DBAs should know these things, or at least have some idea about them: customers frequently ask about them (this is for DBA consultants), and we need to keep our companies on the safe side (this is for all DBAs).

Lastly, sharing the processor core factor table...
Processor Core Factor table (current one -- may be updated in the future)

Thursday, March 9, 2017

OAM -- EBS Home Page, login error, unexpected error, troubleshooting with HTTP trace.

This is not the first unexpected problem that I have encountered during EBS and OAM implementations.
This blog post is about an issue that I encountered after integrating an EBS 12.1 instance with an OAM + OID environment.

I'm writing it because I want you to be aware of the diagnostics that can be done in such situations.

Let's start our real-life story...

I integrated EBS 12.1 with OAM successfully, and I was able to link our EBS users to OID accounts using the EBS auto-link feature.
However, after authenticating our users, I ended up with an unexpected error just after the redirection to the OA Home Page.

The error that I encountered in EBS Home Page was as follows;


Yes, the error was shown in a confirmation popup, and yes, it was in Turkish. (Later I realized that its language could not be changed, as the message was hard-coded in Turkish.)

The error could not be found anywhere (neither on Oracle Support nor anywhere else on the internet).
There were no errors in the WebGate, AccessGate, EBS oacore, OAM managed server or OID managed server logs.
I was stuck. No clues in the server logs, no problems reported anywhere...
At that point, I decided to take an HTTP trace on our client.
I downloaded and installed Fiddler (https://www.telerik.com/download/fiddler) and started tracing. I reproduced the error, and look what I found in Fiddler's trace.


Well... I clicked on the page URL listed in Fiddler, then checked the TextView tab and saw the same error message written there: the error message that I was getting on the EBS Home Page.

The error message was written inside a script, and that script was clearly a custom one which had been added to the standard code.

The script was written to check window.name and raise an error accordingly.
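
The manual search through the trace can also be scripted. The sketch below assumes that the response bodies from the trace have already been saved into a dictionary keyed by URL; the URLs, the bodies and the message text here are made-up placeholders, not the real trace from this story.

```python
def find_message_sources(responses, message):
    """Return the URLs whose response body contains the given message text."""
    return [url for url, body in responses.items() if message in body]


# Hypothetical captured responses (URL -> response body), for illustration only.
captured = {
    "https://ebs.example.com/OA_HTML/AppsLogin": "<html><body>login</body></html>",
    "https://ebs.example.com/OA_HTML/OA.jsp": (
        "<script>if (window.name != 'expected') "
        "{ alert('SAMPLE ERROR TEXT'); }</script>"
    ),
}

suspects = find_message_sources(captured, "SAMPLE ERROR TEXT")
print(suspects)
```

Searching for the exact text of the popup points straight at the response that carries the custom script, which is the same thing I did by hand in Fiddler.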

The first thing that came to my mind was personalizations. Some earlier developer must have added this script to the EBS login page, and that script must have been incompatible with the OAM login.

To be sure, I disabled all the personalizations by setting the Disable Self-Service Personal profile option to Yes and retried the login.
That was it! I could log in without any problems. I could even log out without any problems :)
At the end of the day, I forwarded this problematic personalization to the development team, as it needed to be modified.

Well...
You see what a little customization can do?
You see how a simple HTTP trace can save our day? (or let's say, HTTP web debugging)
You see what being an Apps DBA requires? (I mean the ability to narrow down the issue, to choose the right tool at the right time and place, and to learn and use any tool when it comes to EBS diagnostics.)

That's it for now. See you in my next articles.

Friday, March 3, 2017

VNCR (Valid Node Checking for Registration), as an alternative for COST, CVE-2012-1675, a real life story.

I recently recommended VNCR (Valid Node Checking for Registration) for a customer RAC environment which was affected by the Oracle Security Alert for CVE-2012-1675.

Reference:

The vulnerability was identified as TNS listener poisoning, and Oracle's suggestion was to use Class of Secure Transport (COST) to restrict instance registration.

Reference: 
  • Using Class of Secure Transport (COST) to Restrict Instance Registration in Oracle RAC (Doc ID 1340831.1)

However, we wanted a quick solution, and at that moment I recommended using VNCR to restrict the nodes which can register with the RAC listeners (local and SCAN listeners).

This way, the listeners can prevent remote instances and remote code from registering, so we can protect the system indirectly, to a certain level, without implementing COST.

References:
  • How to Enable VNCR on RAC Database to Register only Local Instances (Doc ID 1914282.1) 
  • Valid Node Checking For Registration (VNCR) (Doc ID 1600630.1)

The implementation of VNCR was simple. We just added the following lines to the listener.ora files. (In this RAC environment, both the SCAN and the local listeners were using the same listener.ora files, which were located in the GRID Home, as recommended for RAC versions >= 11gR2.)

VALID_NODE_CHECKING_REGISTRATION_LISTENER=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN1=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN2=ON
VALID_NODE_CHECKING_REGISTRATION_LISTENER_SCAN3=ON
REGISTRATION_INVITED_NODES_LISTENER_SCAN1=(<Node1publichostname>,<Node2publichostname>)
REGISTRATION_INVITED_NODES_LISTENER_SCAN2=(<Node1publichostname>,<Node2publichostname>)
REGISTRATION_INVITED_NODES_LISTENER_SCAN3=(<Node1publichostname>,<Node2publichostname>)

Note that in RAC, all the RAC nodes should be able to register with the remote (SCAN) listeners, but the local listeners should accept registrations only from their local nodes.
So we didn't declare any invited nodes for the local listener, as we wanted the local listeners to accept registrations only from the local nodes.
(Setting VALID_NODE_CHECKING_REGISTRATION_LISTENER=ON is enough for that!)
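
If there are many listeners, the lines above can be generated instead of typed by hand. Below is a small Python sketch that builds the same listener.ora parameters. The function name and the sample host names are hypothetical; the parameter names are the ones used in this post.

```python
def vncr_lines(local_listener, scan_listeners, invited_nodes):
    """Build the VNCR lines for one local listener and several SCAN listeners."""
    # The local listener only needs valid node checking switched on.
    lines = [f"VALID_NODE_CHECKING_REGISTRATION_{local_listener}=ON"]
    # Each SCAN listener gets valid node checking plus an invited nodes list.
    lines += [f"VALID_NODE_CHECKING_REGISTRATION_{name}=ON" for name in scan_listeners]
    nodes = ",".join(invited_nodes)
    lines += [f"REGISTRATION_INVITED_NODES_{name}=({nodes})" for name in scan_listeners]
    return lines


# Hypothetical host names, for illustration only.
config = vncr_lines(
    "LISTENER",
    ["LISTENER_SCAN1", "LISTENER_SCAN2", "LISTENER_SCAN3"],
    ["node1.example.com", "node2.example.com"],
)
print("\n".join(config))
```

The output has the same shape as the seven lines shown above, with no invited nodes declared for the local listener.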

After adding the lines above to the listener.ora files, we restarted the SCAN and local listeners, and that was it. (We could actually have just reloaded the SCAN and local listeners.)

The following is proof that VNCR is working.

Here, I am implementing VNCR on the remote (SCAN) listener which is running on node 1. Note that I am not adding node 2 to the invited nodes list. As a result, only node 1 can register with the SCAN listener, as you see below ->

[oracle@erm01 admin]$ lsnrctl status LISTENER_SCAN2

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-MAR-2017 08:38:38
Copyright (c) 1991, 2013, Oracle. All rights reserved.
Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
------------------------
Alias LISTENER_SCAN2
Version TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date 03-MAR-2017 08:38:01
Uptime 0 days 0 hr. 0 min. 37 sec
Trace Level off
Security ON: Local OS Authentication
SNMP OFF
Listening Endpoints Summary...
(DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=11.11.11.211)(PORT=1521)))
Services Summary...
Service "ERM" has 1 instance(s).
Instance "ERM1", status READY, has 1 handler(s) for this service...
Service "ermXDB" has 1 instance(s).
Instance "ERM1", status READY, has 1 handler(s) for this service...
The command completed successfully

Here, I set the invited nodes to add node 2 to the invited nodes list, and I see that the instance on node 2 is registered with LISTENER_SCAN2 as well ->

[oracle@erm01 admin]$ lsnrctl status LISTENER_SCAN2
LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 03-MAR-2017 08:37:33
Copyright (c) 1991, 2013, Oracle.  All rights reserved.
 Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=LISTENER_SCAN2)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_SCAN2
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                03-MAR-2017 08:36:38
Uptime                    0 days 0 hr. 0 min. 54 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=LISTENER_SCAN2)))
(DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=11.11.11.211)(PORT=1521)))
Services Summary...
Service "ERM" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
Service "ermXDB" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
The command completed successfully
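
The same check can be done programmatically by parsing the "Services Summary" part of the lsnrctl status output. The following Python sketch is only an illustration: the function name is my own, and the sample text is copied from the output above.

```python
import re


def instances_by_service(status_text):
    """Map each service name to the instance names registered for it."""
    services = {}
    current = None
    for line in status_text.splitlines():
        service = re.match(r'\s*Service "([^"]+)" has \d+ instance', line)
        instance = re.match(r'\s*Instance "([^"]+)", status \w+', line)
        if service:
            current = service.group(1)
            services[current] = []
        elif instance and current is not None:
            services[current].append(instance.group(1))
    return services


sample = '''Services Summary...
Service "ERM" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
Service "ermXDB" has 2 instance(s).
  Instance "ERM1", status READY, has 1 handler(s) for this service...
  Instance "ERM2", status READY, has 1 handler(s) for this service...
The command completed successfully
'''

registered = instances_by_service(sample)
print(registered)
```

If an instance from a node that is not in the invited nodes list shows up in this mapping, VNCR is not doing its job for that listener.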

Well, this is the story of the day, guys :). I did this configuration just 2 hours ago, and here I am writing about it :) I hope you will find it useful.