This document describes programming problems that have occurred in Aida development, together with their solutions, as a reference for future development.
This section describes problems that may occur at Aida run-time.
General Runtime, not specific to client or server.
Problem: At server image activation, the following errors are reported:
Error [125] in bind() call! err:: Address already in use
Socket transport failed to init.
Transport dt_socket failed to initialize, rc = -1.
FATAL ERROR in native method: No transports initialized |
Cause: Another process on the same host is already using the port that the Aida server wants to use. Aida server ports are assigned in their *Server.conf file, so that Aida servers can be made "persistent" CORBA servers. |
Solution: Search the port assignments reported by netstat or lsof for the port assigned to the Aida server in its *Server.conf file. In the following case a process, CaRepeater, had taken the port. Then change the port of either the Aida server or the other process, so that there is no longer a conflict.
[mccas0]:perf/common/tool> more ${CD_CONFSYS}/${AIDA_CA_NAME}.conf
...
ooc.orb.oa.endpoint=iiop --port 58996
[mccas0]:perf/common/tool> lsof -i -P | grep 58996
caRepeate 906 cddev 11u IPv4 0x301....56e8 0t0 TCP *:58996 (LISTEN)
caRepeate 906 cddev 12u IPv4 0x301....0628 0t170 TCP somehost.slac.stanford.edu:58996->somehost.slac.stanford.edu:50797 (BOUND) |
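Where lsof or netstat is not available, the same check can be approximated from Java. The following is a minimal sketch, not part of Aida: the class name PortCheck and its methods are hypothetical, and it assumes the endpoint line has the "--port <number>" form shown in the conf file excerpt above. It extracts the port from the endpoint line and then tries to bind to it; a failed bind suggests that some other process holds the port, although binding can also fail for other reasons (for example, insufficient privilege for ports below 1024).

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Aida.
final class PortCheck {
    private static final Pattern PORT = Pattern.compile("--port\\s+(\\d+)");

    // Extract the port from a line such as
    // "ooc.orb.oa.endpoint=iiop --port 58996"; returns -1 if none is found.
    static int parsePort(String endpointLine) {
        Matcher m = PORT.matcher(endpointLine);
        return m.find() ? Integer.parseInt(m.group(1)) : -1;
    }

    // True if a bind to the port fails, which usually means that another
    // process is already using it.
    static boolean isPortInUse(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            return false; // bind succeeded; the socket is closed again on exit
        } catch (IOException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String line = args.length > 0 ? args[0] : "ooc.orb.oa.endpoint=iiop --port 58996";
        int port = parsePort(line);
        if (port < 0) {
            System.out.println("no --port setting found in: " + line);
        } else {
            System.out.println("port " + port + (isPortInUse(port) ? " is in use" : " looks free"));
        }
    }
}
```

Run it on the host where the server will start, before starting the server. It only reports that the port is taken; identifying the owning process still needs lsof or netstat.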
Problem: At image activation, the following
exception is thrown:
Exception java.lang.NoSuchMethodError:
org.omg.PortableInterceptor.IORInterceptorOperations.components_established(Lorg/omg/PortableInterceptor/IORInfo;)V |
Cause: ORBACUS non-compliance with Java 1.4 |
Solution: Recompile your code under Java 1.3. On Solaris this means just "setenv JAVAVER 1.3" prior to compilation. |
Problem: Exception java.lang.NoClassDefFoundError |
Cause: The run-time environment can't find a class |
Solution:
|
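The solution for this entry is not recorded. As a first diagnostic step, it can help to confirm which class path the JVM is actually using and whether the missing class can be loaded from it. The sketch below is an assumption about how one might do that, not an Aida utility; ClassProbe is a hypothetical name, and the multi-catch syntax assumes Java 7 or later.

```java
// Hypothetical diagnostic, not part of Aida.
final class ClassProbe {
    // True if the named class can be found and linked by this class's loader.
    // The class is not initialized, so its static initializers do not run.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClassProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // NoClassDefFoundError is a LinkageError
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        for (String name : args) {
            System.out.println(name + (isLoadable(name) ? " : found" : " : NOT found"));
        }
    }
}
```

Run it with the same CLASSPATH and JVM as the failing program, passing the fully qualified class name from the exception message as the argument.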
Problem: Exception org.omg.CORBA.BAD_PARAM |
Cause: The package name with which the client was compiled did not match the package name with which the server was compiled. In this case the server was just defined as in "package Slc", but the client had been built using CORBA client side code generated from IDL in which the full package name "edu.stanford.slac.aida.slc" was used. This caused the id() check in the <interface>.narrow( OR ) call to fail, which throws CORBA.BAD_PARAM. |
Solution: Make sure the package names match. This may mean changing the IDL, recompiling the IDL for the server, and restarting the server so that it inserts a corrected Object Reference (which includes the package name) into the Stringified Object Reference (SOR) it publishes. The package name should be specified in the jidl compile line, and there should be no #prefix directive in the IDL file (which overrides --prefix-package). Eg jidl --prefix-package da edu.stanford.edu.aida idl\da.idl |
Problem: ORB<property> unknown message issued by CORBA.ParseArgs |
Cause: You are using a version of the ORB which does not recognize the property. Perhaps the property has been deprecated, or you are using an old ORB that does not yet recognize it. |
Solution: Check the valid properties for the version of the ORB you are using. Check that you are using the version of the ORB you think you are. |
Problem: An IMR client, like the IMR Console, can't start a server. |
Symptom: The server may appear to start momentarily, but then goes away. |
Cause: The bat file that starts the server produces output. |
Solution: Put "@echo off" at the top of the bat file. |
Cause: The CLASSPATH under which the server is started remotely, is incorrect or incomplete. |
Solution: Check the CLASSPATH on the host process running server. Note that, for NT based servers being run by the IMR, the CLASSPATH which is used is the CLASSPATH set from the System control panel, not any CLASSPATH set by the bat file used to start the server by the IMR such as Start<servername>Server.bat. If you change the required CLASSPATH of a server executed remotely by the IMR, you must change the CLASSPATH on the host of every oad that may run the server, and you must restart IMR in a new window so that it gets the new CLASSPATH. |
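One quick check on each host is to list the CLASSPATH entries that do not exist there. The following is a minimal sketch under stated assumptions: ClasspathCheck is a hypothetical name, not an Aida tool, and it treats every entry as a literal file or directory, so wildcard entries such as "lib/*" would be reported as missing.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical diagnostic, not part of Aida.
final class ClasspathCheck {
    // Return the class path entries that do not exist on this host.
    static List<String> missingEntries(String classpath, String separator) {
        List<String> missing = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            if (!entry.isEmpty() && !new File(entry).exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Checks the class path this JVM was started with.
        String cp = System.getProperty("java.class.path");
        for (String entry : missingEntries(cp, File.pathSeparator)) {
            System.out.println("missing: " + entry);
        }
    }
}
```

To be meaningful it must be started in the same way as the server (on NT, by a process that inherits the System control panel CLASSPATH), otherwise it checks a different class path from the one the IMR-started server sees.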
Symptom:
[ IMR: exec: ca: success ] |
Cause: The server MAX SPAWN Count has been exceeded. In particular, check that an instance of the server isn't already running on the host as a background process that is not showing up in the IMR console for whatever reason.
Diagnose on Unix with: slcs1> ps -ef | grep greg, looking for, for instance, java jdk processes running in the background. |
Solution: Kill all background processes of the server that are already running before attempting to start a new instance from the IMR. E.g. slcs1> ps -ef | awk '/greg.*jdk1/ {print "kill "$2}' > killgreg.sht |
Problem: Can't remove a non-existent server from the IMR. |
Symptom: If you stop an OAD which the IMR thinks is running a server which is in fact not running (for whatever reason), then you can't "Stop..." or "Delete..." the server from the IMR console. You get contradictory messages: you can't stop the server because the IMR says it's not running (which is true), and you can't delete the server because the IMR says it's running! |
Cause: Unknown |
Solution: Restart the OAD on the host. This prompts the IMR to update its database, and mark the server "not-running". |
Problem: A restarted IMR can't make contact with OAD on some host |
Symptom: After restarting the IMR, say to change from administrative mode to non-admin, the IMR cannot contact the OAD on some host, and issues "warning: IMR: Could not contact OAD at: corbaloc::<host>:/OAD" |
Cause: Unknown |
Solution: Restart the OAD on the host. On restart, the OAD does successfully contact the IMR and tell it that it's running. |
Problem: OAD cannot be contacted by IMR after IMR restart |
Cause: Unknown. Of course check that the ooc.imr.*port* settings match in both OAD and IMR conf files. |
Solution: Restart the OAD on each host which could not be
contacted after restarting the IMR. Ie, on each host, issue a command of
the form:
imr -ORBconfig %AIDASCRIPT%\oad.conf |
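The port check mentioned in the cause can be done mechanically by comparing the two conf files. The sketch below is illustrative only: ConfPortCompare is a hypothetical name, and it assumes the conf files can be read as Java-style "key=value" properties (as the endpoint line quoted earlier in this document suggests). It reports every key that starts with "ooc.imr." and contains "port" whose value differs between the two files. No actual property names are assumed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Objects;
import java.util.Properties;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Aida or the ORB distribution.
final class ConfPortCompare {
    private static Properties parse(String confText) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p;
    }

    // Keys matching ooc.imr.*port* whose values differ between the two files.
    // A key present in only one file is also reported (its value shows as null).
    static Map<String, String> portMismatches(String oadConf, String imrConf) {
        Properties oad = parse(oadConf);
        Properties imr = parse(imrConf);
        Set<String> keys = new TreeSet<>(oad.stringPropertyNames());
        keys.addAll(imr.stringPropertyNames());
        Map<String, String> result = new TreeMap<>();
        for (String key : keys) {
            if (!key.startsWith("ooc.imr.") || !key.contains("port")) {
                continue;
            }
            String a = oad.getProperty(key);
            String b = imr.getProperty(key);
            if (!Objects.equals(a, b)) {
                result.put(key, "oad=" + a + ", imr=" + b);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = OAD conf file, args[1] = IMR conf file
        String oad = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        String imr = new String(Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        portMismatches(oad, imr).forEach((k, v) -> System.out.println(k + ": " + v));
    }
}
```

A key reported with a null value on one side is not necessarily an error, since some settings may legitimately appear in only one of the two files.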
Problem: imradmin operation hangs (never completes) |
Cause: Unknown. |
Solution: check that the IMR process itself doesn't need a RETURN typed to it! |
Problem: Client can't contact IMR |
Symptom: exception message "IMRDomain not currently reachable" |
Cause: ports don't agree on client and server |
Solution: Check port specification, perhaps add 10000 port number spec. |
Problem: java.lang.UnsatisfiedLinkError
MCCDEV> java "edu.stanford.slac.aida.slc.Server"
java.lang.UnsatisfiedLinkError: no CorbaDBShr in java.library.path
at java.lang.ClassLoader.loadLibrary (ClassLoader.java:1325) (pc 343)
|
Cause: CorbaDBShr was not defined |
Solution: Run @SETLOGICAL.COM to do the define/log CorbaDBShr. Make sure CorbaDBShr exists where the logical points to. |
Problem: java.lang.UnsatisfiedLinkError
MCCDEV> java "edu.stanford.slac.aida.slc.Server"
Server ready
Init called!
java.lang.UnsatisfiedLinkError
at edu.stanford.slac.aida.slc.SlcI_impl.DbInit
at edu.stanford.slac.aida.slc.SlcI_impl.Init (SlcI_impl.java:47) (pc 27)
|
Cause: A JNI routine couldn't be resolved at runtime because the VMS shareable in which it was defined hadn't been rebuilt with the correctly defined UNIVERSAL name for the JNI routine. In this case, the long package name "edu.stanford.slac.aida" had caused the fully qualified name of the JNI routine dbInit to be longer than 31 characters. Names longer than 31 characters are automatically shortened by the Java and javah compilers to 31 characters. Care must be taken to compile the .c source code implementing the JNI routines with the correct qualifier, /name=(shortened, as_is), to make sure the C compiler produces the same shortened symbol name, and to use SCAN_GLOBALS_FOR_OPTION.COM to build a .OPT file which correctly defines the UNIVERSAL symbols when linking the shareable. |
Solution: First check that the name of the called function as defined in the .c file matches the name defined in the output of the javah compiler (the .h file) exactly, in both length and case. If that doesn't solve it, check whether the name is longer than 31 characters and, if it is, that the correct shortened names are defined in the shareable being called. |
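To see in advance which native methods will run into the 31 character limit, the full JNI symbol name can be computed from the class and method names. The sketch below applies the standard JNI name escaping rules (for example, "_" in a class name becomes "_1") and flags names longer than 31 characters. JniName is a hypothetical helper, not an Aida or Compaq tool; it handles only non-overloaded native methods, and it does not attempt to reproduce the shortening algorithm itself, which this document does not describe.

```java
// Hypothetical helper, not part of Aida or the VMS Java kit.
final class JniName {
    static final int VMS_SYMBOL_LIMIT = 31;

    // Apply the JNI escape rules to a class or method name.
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');               // package separator
            } else if (c == '_') {
                sb.append("_1");
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if (c < 128 && Character.isLetterOrDigit(c)) {
                sb.append(c);
            } else {
                sb.append(String.format("_0%04x", (int) c)); // other characters as _0xxxx
            }
        }
        return sb.toString();
    }

    // Full (unshortened) symbol name of a non-overloaded native method.
    static String nativeSymbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    static boolean needsShortening(String symbol) {
        return symbol.length() > VMS_SYMBOL_LIMIT;
    }

    public static void main(String[] args) {
        String symbol = nativeSymbol("edu.stanford.slac.aida.slc.SlcI_impl", "DbInit");
        System.out.println(symbol + " (" + symbol.length() + " characters)"
                + (needsShortening(symbol) ? " exceeds the 31 character limit" : ""));
    }
}
```

Any name it flags is one for which the /name=(shortened, as_is) qualifier and the SCAN_GLOBALS_FOR_OPTION.COM step matter.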
Problem: server never gets up and says "Server up and ready" |
Cause: Some other process is holding the Oracle db lock while the SLC server is trying to put its OR into the db. |
Solution: Find the process holding the Oracle db lock, and stop it. Eg check SQLPlus is not holding lock on interfaces table. |
Problem: command line in shell script appears to have become garbled. |
Cause: 1) Unix (on slcs1 anyway) seems to have a limited command line buffer, possibly 512 characters. Check that, after variable expansion, the line that must be interpreted is still < 512 characters.
2) Also check whether any environment variable used contains a hidden character, perhaps a CR at the end due to being defined in a script that was mistakenly edited on NT before being executed on Unix! |
Solution: shorten number of characters in command line. Eg, remove -classpath and use CLASSPATH env variable instead. |
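Both causes can be checked programmatically. The following is a rough sketch, assuming the 512 character figure quoted above (which is itself only a guess for slcs1); CommandLineCheck is a hypothetical name. It flags environment variables containing control characters such as CR, and tests an expanded command line against the assumed limit.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical diagnostic, not part of Aida.
final class CommandLineCheck {
    // Assumed limit, taken from the "possibly 512 characters" estimate above.
    static final int ASSUMED_LIMIT = 512;

    // True if the value contains a control character other than tab or LF,
    // for example a CR left behind by an editor on NT.
    static boolean hasHiddenControlChar(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (Character.isISOControl(c) && c != '\t' && c != '\n') {
                return true;
            }
        }
        return false;
    }

    // True if the expanded command line is not strictly shorter than the limit.
    static boolean exceedsLimit(String expandedCommand, int limit) {
        return expandedCommand.length() >= limit;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : new TreeMap<>(System.getenv()).entrySet()) {
            if (hasHiddenControlChar(e.getValue())) {
                System.out.println("hidden control character in " + e.getKey());
            }
        }
        String command = String.join(" ", args);
        if (exceedsLimit(command, ASSUMED_LIMIT)) {
            System.out.println("command line is " + command.length() + " characters long");
        }
    }
}
```

Run it from the same shell environment as the failing script, passing the expanded command line as the arguments.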
Problem: cvs commands causing unexpected results or failing to take effect |
Cause: AFS token expired |
Solution: Acquire new AFS token, and re-issue CVS command |
Problem: java -jar <jarfile> command results in "Failed to load Main-Class manifest attribute" |
Eg: |
Cause: The format of the contents of the manifest template file which specifies the main class is very particular. There must be a <CR> at the end of the line, even if there is only one line (the one containing the name of the main class). |
Solution: Add a <CR> to the end of the Main-Class line. |
Problem: jar file packaging command, given with the m option to include a given manifest template file that specifies a main class, results in "java.io.IOException: invalid header field name: Main-Class" |
Eg: |
Cause: The Main-Class attribute in the manifest template file, in the above case called MainClass.txt, had a space before the ":" (!) |
Solution: Remove the space between "Main-Class" and ":" in the manifest file. |
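The two manifest problems above (a missing line terminator after the last line, and a space before the ":") are easy to miss by eye, so a small checker can be useful before running the jar command. This is a sketch only: ManifestTemplateCheck is a hypothetical name, and the rules it applies (header names made of letters, digits, "-" and "_", followed by ": ") are a simplified reading of the JAR manifest format, not a full validator.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker, not part of the JDK or Aida.
final class ManifestTemplateCheck {
    // Return a description of each problem found in the manifest template text.
    static List<String> problems(String text) {
        List<String> found = new ArrayList<>();
        if (!text.endsWith("\n") && !text.endsWith("\r")) {
            found.add("last line has no line terminator");
        }
        String[] lines = text.split("\\r\\n|\\r|\\n");
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            if (line.isEmpty() || line.startsWith(" ")) {
                continue; // blank line or continuation line
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                found.add("line " + (i + 1) + ": no ':' separator");
            } else if (!line.substring(0, colon).matches("[A-Za-z0-9][A-Za-z0-9_-]*")) {
                found.add("line " + (i + 1) + ": invalid header name before ':'");
            } else if (colon + 1 >= line.length() || line.charAt(colon + 1) != ' ') {
                found.add("line " + (i + 1) + ": ':' must be followed by a space");
            }
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        // args[0] = manifest template file, for example MainClass.txt
        String text = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        for (String problem : problems(text)) {
            System.out.println(problem);
        }
    }
}
```

A template that produces no output from this checker can still be rejected by jar for other reasons.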
Problem: Can't find a class. |
Eg MCCDEV> javac -classpath ".;udslc/greg/dev/aida" @allslc.list
edu/stanford/slac/aida/slc/SlcIPOA.java:21: Superclass org.omg.PortableServer.Servant of class edu.stanford.slac.aida.slc.SlcIPOA not found.
extends org.omg.PortableServer.Servant |
Cause: Incorrect CLASSPATH, or JAVA$CLASSPATH |
Solution: Don't override classpath unless you're sure JAVA$CLASSPATH is wrong or incomplete. Check the logical JAVA$CLASSPATH, in the JOB table. Eg just MCCDEV> javac @allslc.list |
Problem: Wrong or no package name alert |
Eg error: File ./edu/stanford/slac/aida/slc/SlcI_impl.class does not contain type edu.stanford.slac.aida.slc.SlcI_impl as expected, but class Slc.SlcI_impl. Please remove the file, or make sure it appears in the correct subdirectory of the class path.
edu/stanford/slac/aida/slc/Server.java:16: Class edu.stanford.slac.aida.slc.SlcI_impl not found.
SlcI_impl slciImpl = new SlcI_impl();
^ |
Cause: In the .java file the "package" directive at the top of the file was wrong: it read just "package Slc", not "package edu.stanford.slac.aida.slc;" |
Solution: Correct the package name in the package directive at the top of the .java file. |
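This kind of mismatch can be detected before compiling by comparing the package declared in each source file with the package implied by its directory. The following is a minimal sketch: PackageCheck is a hypothetical name, the path is assumed to be relative to the source root, and the regular expression is naive (it would also match the word "package" at the start of a line inside a comment).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker, not part of Aida.
final class PackageCheck {
    private static final Pattern DECL =
            Pattern.compile("(?m)^\\s*package\\s+([\\w.]+)\\s*;");

    // Package named in the source text, or "" if there is no package directive.
    static String declaredPackage(String source) {
        Matcher m = DECL.matcher(source);
        return m.find() ? m.group(1) : "";
    }

    // Package implied by a path relative to the source root,
    // e.g. "./edu/stanford/slac/aida/slc/SlcI_impl.java".
    static String expectedPackage(String relativePath) {
        String p = relativePath.replace('\\', '/');
        if (p.startsWith("./")) {
            p = p.substring(2);
        }
        int slash = p.lastIndexOf('/');
        return slash < 0 ? "" : p.substring(0, slash).replace('/', '.');
    }

    static boolean matches(String relativePath, String source) {
        return expectedPackage(relativePath).equals(declaredPackage(source));
    }

    public static void main(String[] args) {
        String path = "./edu/stanford/slac/aida/slc/SlcI_impl.java";
        String source = "package Slc;\npublic class SlcI_impl {}\n";
        System.out.println("expected " + expectedPackage(path)
                + ", declared " + declaredPackage(source)
                + (matches(path, source) ? "" : " (mismatch)"));
    }
}
```

The main method uses the file name and the wrong "package Slc" directive from the case above as its sample input.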
Problem: Undefined Symbols at link |
Eg MCCDEV> @buildclib
%LINK-W-NUDFSYMS, 3 undefined symbols:
%LINK-I-UDFSYM, Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbGet
%LINK-I-UDFSYM, Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbInit
%LINK-I-UDFSYM, Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbPut
%LINK-W-USEUNDEFSYMV, undefined symbol Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbInit referenced in symbol vector option
%LINK-W-USEUNDEFSYMV, undefined symbol Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbGet referenced in symbol vector option
%LINK-W-USEUNDEFSYMV, undefined symbol Java_edu_stanford_slac_aida_slc_SlcI_1impl_DbPut referenced in symbol vector option
%DCL-I-SUPERSEDE, previous value of CORBADBSHR has been superseded |
Cause: Note that in this case the undefined symbols were in the symbol vector option. The reason is that the symbols are > 31 characters long, and the linker .OPT file that defined them as UNIVERSAL referred to them by their full names. But Java shortens names over 31 characters. The VMS utility DCL file SCAN_GLOBALS_FOR_OPTION.COM should be run on the .OBJ file to build the .OPT file, translating the long names to ones of 31 characters or fewer using the same algorithm as used by the javac compiler. |
Solution: Get SCAN_GLOBALS_FOR_OPTION.COM and run it on the .obj files which define the functions whose symbols are undefined in the message. SCAN_GLOBALS_FOR_OPTION.COM requires JAVA$BUILD_OPTION.EXE and JAVA$STUBS_DEFINED.EXE, and both executables must be in the working directory from which SCAN_GLOBALS_FOR_OPTION.COM is run. SCAN_GLOBALS_FOR_OPTION.COM is part of the JNI_EXAMPLES saveset distributed from Compaq. It may be found in udslc:[rcs.java.exampes] |
Author: Greg White, 15-Jul-2001
Modified by: