Repeated DBSNMP Authentication Failures

I've recently started a new job as an Oracle DBA!  Long story for another day.

As part of taking over control of the new database, I spent some time monitoring the authentication failures in the audit log.

On this particular (production!) database, this is what I was seeing:

User Name   OS User Name   Date        Time
DBSNMP      oracle         27-May-14   15.14.56
DBSNMP      oracle         27-May-14   15.13.56
DBSNMP      oracle         27-May-14   15.12.55
DBSNMP      oracle         27-May-14   15.11.55
DBSNMP      oracle         27-May-14   15.10.55
DBSNMP      oracle         27-May-14   15.09.55

And on and on and on.  Every minute, a failed connection.  OK, maybe not an immediate problem, given that the previous DBA had set up the DBSNMP account with UNLIMITED connection attempts (so the failures would not lock the account), but definitely an indication that something was wrong.
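As a quick sanity check on the "every minute" pattern, something like the following shell sketch could print the gap in seconds between consecutive failures. The helper name failure_gaps is made up for illustration, and the four-column, newest-first input layout is an assumption based on the listing above rather than any particular audit view.

```shell
#!/bin/sh
# Sketch: print the gap in seconds between consecutive failed logins.
# failure_gaps is a hypothetical helper. Assumed input layout:
#   USERNAME OS_USER DATE HH.MM.SS   (newest row first, all on the same day)
failure_gaps() {
  awk -v user="${1:-DBSNMP}" '$1 == user {
    split($4, t, "[.]")                      # HH.MM.SS -> t[1], t[2], t[3]
    secs = t[1] * 3600 + t[2] * 60 + t[3]    # seconds since midnight
    if (seen) print prev - secs              # rows are newest first
    prev = secs; seen = 1
  }'
}

# Sample rows copied from the listing above.
audit_sample='DBSNMP oracle 27-May-14 15.14.56
DBSNMP oracle 27-May-14 15.13.56
DBSNMP oracle 27-May-14 15.12.55
DBSNMP oracle 27-May-14 15.11.55
DBSNMP oracle 27-May-14 15.10.55
DBSNMP oracle 27-May-14 15.09.55'

printf '%s\n' "$audit_sample" | failure_gaps
```

For the rows above this prints 60, 61, 60, 60, 60 (one per line). A gap that steady points to an automated client retrying on a timer rather than a person mistyping a password.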

I went through the following troubleshooting:
  • Determined where the connections were coming from (localhost)
  • Checked if there were any cron jobs running at these times (no)
  • Checked if OEM/DB Control was running (yes)
  • Checked whether OEM/DB Control was configured with the correct username/password (it was)
  • Removed the password from targets.xml in $DBCONTROL_HOME/sysman/config/ and reconfigured it through the interface, testing as I went
  • Checked if there were any jobs in the database or OEM configured incorrectly (none)
  • Scanned through every other process (using ps -ef | less) to see if something else was running that looked out of place
  • Stopped dbconsole to see if the connection attempts stopped ($ORACLE_HOME/bin/emctl stop dbconsole) - they didn't

It was at this point, after restarting dbconsole, that I noticed two emagent processes running, which didn't seem right. On closer inspection, one of them had been running since 2013, despite my restart of dbconsole:

PRIMARY-/u1/app/oracle/product/11.2.0/dbhome_1/bin> ps aux | grep emagent
oracle   24731  0.0  0.0 929580  6580 ?        Sl    2013 147:46 /u1/app/oracle/product/11.2.0/dbhome_1/bin/emagent
oracle   26818  0.1  0.2 1327364 43000 pts/3   Sl   14:39   0:03 /u1/app/oracle/product/11.2.0/dbhome_1/bin/emagent
oracle   36234  0.0  0.0 103248   864 pts/3    S+   15:15   0:00 grep emagent

Looking at the output from emctl status agent, I could see which agent process should be running:

PRIMARY-/u1/app/oracle/product/11.2.0/dbhome_1/bin> ./emctl status agent
Oracle Enterprise Manager 11g Database Control Release 11.2.0.3.0
Copyright (c) 1996, 2011 Oracle Corporation.  All rights reserved.
---------------------------------------------------------------
Agent Version     : 10.2.0.4.4
[snip]
Agent Process ID  : 26818
Parent Process ID : 26766

I killed the other process (24731) and voila, the connection attempts stopped.
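
The comparison I did by eye (the PID that emctl reports versus the emagent PIDs in the process list) could be sketched in shell roughly as follows. The function names agent_pid_from_status and stale_emagent_pids are hypothetical helpers, not part of any Oracle tooling, and the sample text is trimmed from the output shown above.

```shell
#!/bin/sh
# Sketch: list emagent processes other than the one emctl reports.
# Function names are hypothetical; sample data is trimmed from this post.

# Read `emctl status agent` output on stdin; print the "Agent Process ID" value.
agent_pid_from_status() {
  awk -F':' '/^Agent Process ID/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

# $1 = PID to keep. Read "PID COMMAND" lines on stdin and print the PIDs of
# any other processes whose command path ends in /emagent.
stale_emagent_pids() {
  awk -v keep="$1" '$2 ~ /\/emagent$/ && $1 != keep { print $1 }'
}

status_sample='Agent Version     : 10.2.0.4.4
Agent Process ID  : 26818
Parent Process ID : 26766'

ps_sample='24731 /u1/app/oracle/product/11.2.0/dbhome_1/bin/emagent
26818 /u1/app/oracle/product/11.2.0/dbhome_1/bin/emagent
36234 grep emagent'

expected=$(printf '%s\n' "$status_sample" | agent_pid_from_status)
printf '%s\n' "$ps_sample" | stale_emagent_pids "$expected"
```

With the sample data this prints 24731, the leftover process from 2013. On a live host the same functions could presumably be fed from "$ORACLE_HOME/bin/emctl" status agent and ps -e -o pid= -o args= instead of the samples. The sketch only lists candidates and leaves the kill as a manual step, since it is worth looking at each process before ending it on a production box.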
