List Info

Thread: Timeout support - Please review




Timeout support - Please review
user name
2006-12-04 13:22:15
Hi,

Any and all occurrences of g_cond_wait() has the potential
to block
forever if the remote server is physically disconnected or
powered of.

The blocking behavior can be demonstrated (with pseudo code)
like this:

1) Get a server object reference:

   CORBA_ORB orb = get_orb();
   char *objref = get_corbaloc_str();
   CORBA_Object obj =
(CORBA_Object)CORBA_ORB_string_to_object(orb, objref, ev);


2) Invoke some method on the server object:

   CORBA_Environment ev[1];
   CORBA_exception_init(ev);
   FOO_INTERFACE_method(obj, <arguments>, ev); 


The above operation will block forever in
giop_recv_buffer_get() if the
remote server has been physically powered down. It
immediately throws an
exception if the server object is merely dead or
non-existent. 

As a rule of thumb - Blocking forever is generally bad
performance-wise
for the client application...


The appended patch therefore implements support for the
ex_CORBA_TIMEOUT
system exception by replacing most occurrences of
g_cond_wait() with
g_cond_timed_wait(). The only remaining instance of
g_cond_wait() is
found in link_exec_command(). I haven't observed any
problems with
link_exec_command() so I've not touched that code.

The affected code is:

1) New ORB option: GIOPTimeoutLimit - timeout limit in
microseconds.
Defaults to 30 seconds.

2) giop_recv_buffer_get(). This place is obvious. We do not
want to wait
indefinitely for data to be received from a downed host. The
waiting
interval is configurable with the new GIOPTimeoutLimit ORB
option.

3) Any code invoking link_wait():

   * link_connection_wait_connected_T()
   * link_connection_try_reconnect()
   * giop_connection_try_reconnect(), calls
link_connection_try_reconnect()
   * ORBit_try_connection_T(), calls
giop_connection_try_reconnect()
   * ORBit_object_get_connection(), calls
ORBit_try_connection_T()

4) The waiting time in link_wait() is set at compile time to
LINK_WAIT_TIMEOUT_USEC which is presently 10 seconds.

5) link_wait() has been modified to return a gboolean. FALSE
if the
timeout expired, TRUE otherwise.

6) link_connection_wait_connected_T() and
link_connection_try_reconnect() will both disconnect the
connection if
link_wait() experienced a timeout (returned FALSE).

Please review and test this patch. I'm not entirely
convinced that I
haven't introduced a bad bug or two but at least it works(*)
here.


Thanks,
  jules

(*) Tested with evolution-brutus on a dual Opteron Gentoo.


Index: ChangeLog
============================================================
=======
RCS file: /cvs/gnome/ORBit2/ChangeLog,v
retrieving revision 1.778
diff -u -p -r1.778 ChangeLog
--- ChangeLog	22 Nov 2006 20:34:39 -0000	1.778
+++ ChangeLog	4 Dec 2006 13:20:48 -0000
 -1,3
+1,25 
+2006-12-04  Jules Colding  <coldingomesc.com>
+
+	* configure.in: Added autoconf 2.60+ required datarootdir
variable
+
+2006-12-03  Jules Colding  <coldingomesc.com>
+
+	* include/orbit/GIOP/giop.h: Declare
giop_recv_set_timeout().
+
+	* src/orb/orb-core/corba-orb.c (CORBA_ORB_init): Set GIOP
timeout 
+	limit from the ORB option.
+	Added new ORB option - GIOPTimeoutLimit.
+
+	* src/orb/orb-core/orbit-small.c
(ORBit_small_invoke_stub): Throw a 
+	TIMEOUT exception if a reply hasn't been recieved within
the GIOP
+	timeout limit.
+
+	* src/orb/GIOP/giop-recv-buffer.c (giop_recv_buffer_get):
Replaced 
+	a g_cond_wait() with a g_cond_timed_wait(). The original
g_cond_wait()
+	could potentially block forever if a remote server was
physically
+	offline.
+	(giop_recv_set_timeout): Function to adjust the GIOP
timeout limit.
+
 2006-11-22  Kjartan Maraas  <kmaraasgnome.org>
 
 	* ORBit-2.0.pc.in: Move MINGW_CFLAGS and LIBM to
Libs.private
Index: configure.in
============================================================
=======
RCS file: /cvs/gnome/ORBit2/configure.in,v
retrieving revision 1.175
diff -u -p -r1.175 configure.in
--- configure.in	22 Nov 2006 20:34:41 -0000	1.175
+++ configure.in	4 Dec 2006 13:20:48 -0000
 -37,6
+37,9  AM_CONFIG_HEADER(config.h)
 dnl Initialize automake stuff
 AM_INIT_AUTOMAKE(ORBit2, $ORBIT_VERSION, no-define)
 
+dnl Required by autoconf 2.60
+AC_SUBST(datarootdir)
+
 AC_CANONICAL_HOST
 AC_MSG_CHECKING([for Win32])
 case "$host" in
Index: include/orbit/GIOP/giop-recv-buffer.h
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/include/orbit/GIOP/giop-recv-buffer.h,v
retrieving revision 1.33
diff -u -p -r1.33 giop-recv-buffer.h
--- include/orbit/GIOP/giop-recv-buffer.h	14 Jan 2004
11:04:01 -0000	1.33
+++ include/orbit/GIOP/giop-recv-buffer.h	4 Dec 2006
13:20:48 -0000
 -58,7
+58,8  void            giop_recv_list_setup_que
 void            giop_recv_list_setup_queue_entry_async
(GIOPMessageQueueEntry *ent,
 							GIOPAsyncCallback      cb);
 
-GIOPRecvBuffer *giop_recv_buffer_get              
(GIOPMessageQueueEntry *ent);
+GIOPRecvBuffer *giop_recv_buffer_get              
(GIOPMessageQueueEntry *ent,
+						    gboolean *timeout);
 void            giop_recv_buffer_unuse            
(GIOPRecvBuffer        *buf);
 
 #define giop_recv_buffer_reply_status(buf) (               
                            
Index: include/orbit/GIOP/giop-types.h
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/include/orbit/GIOP/giop-types.h,v
retrieving revision 1.21
diff -u -p -r1.21 giop-types.h
--- include/orbit/GIOP/giop-types.h	27 Oct 2003 16:14:12
-0000	1.21
+++ include/orbit/GIOP/giop-types.h	4 Dec 2006 13:20:48
-0000
 -35,7
+35,9  struct _GIOPThread {
 					gpointer dummy);
 };
 
-#define GIOP_INITIAL_MSG_SIZE_LIMIT 256*1024
+#define GIOP_INITIAL_TIMEOUT_LIMIT (30000000) /* 30 seconds
*/
+
+#define GIOP_INITIAL_MSG_SIZE_LIMIT (256*1024) /* in bytes
*/
 
 typedef enum {
 	GIOP_CONNECTION_SSL
Index: include/orbit/GIOP/giop.h
============================================================
=======
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop.h,v
retrieving revision 1.24
diff -u -p -r1.24 giop.h
--- include/orbit/GIOP/giop.h	5 Apr 2006 07:16:10 -0000	1.24
+++ include/orbit/GIOP/giop.h	4 Dec 2006 13:20:48 -0000
 -23,6
+23,7  gboolean    giop_thread_safe       (void
 gboolean    giop_thread_io         (void);
 GIOPThread *giop_thread_self       (void);
 void        giop_invoke_async      (GIOPMessageQueueEntry
*ent);
+void        giop_recv_set_timeout  (const glong timeout);
 void        giop_recv_set_limit    (glong limit);
 glong       giop_recv_get_limit    (void);
 void        giop_incoming_signal_T (GIOPThread *tdata,
GIOPMsgType t);
 -45,6
+46,7  void        giop_thread_new_check       
 void        giop_thread_queue_process    (GIOPThread
*tdata);
 gboolean    giop_thread_queue_empty_T    (GIOPThread
*tdata);
 void        giop_thread_queue_tail_wakeup(GIOPThread
*tdata);
+
 
 #endif /* ORBIT2_INTERNAL_API */
 
Index: linc2/ChangeLog
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/ChangeLog,v
retrieving revision 1.262
diff -u -p -r1.262 ChangeLog
--- linc2/ChangeLog	22 Nov 2006 19:13:41 -0000	1.262
+++ linc2/ChangeLog	4 Dec 2006 13:20:49 -0000
 -1,3
+1,8 
+2006-12-04  Jules Colding  <coldingomesc.com>
+
+	* src/linc.c (link_wait): Do not wait forever for the link
+	condition to get signaled.
+
 2006-11-22  Kjartan Maraas  <kmaraasgnome.org>
 
 	* src/Makefile.am: Remove SSL_LIBS and SSL_CFLAGS.
Index: linc2/include/linc/linc.h
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/include/linc/linc.h,v
retrieving revision 1.17
diff -u -p -r1.17 linc.h
--- linc2/include/linc/linc.h	28 Jan 2005 12:34:57
-0000	1.17
+++ linc2/include/linc/linc.h	4 Dec 2006 13:20:49 -0000
 -33,7
+33,7  GMainLoop *link_main_get_loop    (void);
 guint      link_main_idle_add    (GSourceFunc function,
 				  gpointer    data);
 
-void       link_wait             (void);
+gboolean   link_wait             (void);
 void       link_signal           (void);
 
 gboolean   link_thread_io        (void);
Index: linc2/src/linc-connection.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/src/linc-connection.c,v
retrieving revision 1.117
diff -u -p -r1.117 linc-connection.c
--- linc2/src/linc-connection.c	9 Sep 2006 07:54:37
-0000	1.117
+++ linc2/src/linc-connection.c	4 Dec 2006 13:20:49 -0000
 -635,8
+635,10  link_connection_do_initiate (LinkConnect
 static LinkConnectionStatus
 link_connection_wait_connected_T (LinkConnection *cnx)
 {
-	while (cnx && cnx->status == LINK_CONNECTING)
-		link_wait ();
+	while (cnx && cnx->status == LINK_CONNECTING) {
+		if (!link_wait ())
+			link_connection_disconnect (cnx);
+	}
 
 	return cnx ? cnx->status : LINK_DISCONNECTED;
 }
 -659,8
+661,12  link_connection_try_reconnect (LinkConne
 			cnx->inhibit_reconnect = FALSE;
 			dispatch_callbacks_drop_lock (cnx);
 			g_main_context_release (NULL);
-		} else 
-			link_wait ();
+		} else {
+			if (!link_wait ()) {
+				link_connection_disconnect (cnx);
+				break;
+			}
+		}
 	}
 
 	if (cnx->status != LINK_DISCONNECTED)
Index: linc2/src/linc.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/src/linc.c,v
retrieving revision 1.63
diff -u -p -r1.63 linc.c
--- linc2/src/linc.c	2 Jun 2006 08:14:22 -0000	1.63
+++ linc2/src/linc.c	4 Dec 2006 13:20:49 -0000
 -52,6
+52,9  SSL_METHOD *link_ssl_method;
 SSL_CTX    *link_ssl_ctx;
 #endif
 
+/* max time to wait for the link condition to get signaled
- 10 seconds */
+#define LINK_WAIT_TIMEOUT_USEC (10000000) 
+
 static void link_dispatch_command (gpointer data, gboolean
immediate);
 
 gboolean
 -534,17
+537,28  link_signal (void)
 	}
 }
 
-void
+gboolean
 link_wait (void)
 {
+	GTimeVal gtime;
+
 	if (!(link_is_thread_safe &&
link_is_io_in_thread)) {
 		link_unlock ();
 		link_main_iteration (TRUE);
 		link_lock ();
 	} else {
 		g_assert (link_main_cond != NULL);
-		g_cond_wait (link_main_cond, link_main_lock);
+
+		g_get_current_time (&gtime);
+		g_time_val_add (&gtime, LINK_WAIT_TIMEOUT_USEC);
+		if (!g_cond_timed_wait (link_main_cond, link_main_lock,
&gtime)) {
+			if (link_is_locked ())
+				link_unlock ();
+			return FALSE;
+		}
 	}
+
+	return TRUE;
 }
 
 
Index: src/orb/GIOP/giop-recv-buffer.c
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/src/orb/GIOP/giop-recv-buffer.c,v
retrieving revision 1.81
diff -u -p -r1.81 giop-recv-buffer.c
--- src/orb/GIOP/giop-recv-buffer.c	5 Apr 2006 07:17:17
-0000	1.81
+++ src/orb/GIOP/giop-recv-buffer.c	4 Dec 2006 13:20:50
-0000
 -690,10
+690,21  check_got (GIOPMessageQueueEntry *ent)
 		(ent->cnx->parent.status == LINK_DISCONNECTED));
 }
 
+static glong giop_initial_timeout_limit =
GIOP_INITIAL_TIMEOUT_LIMIT;
+
+void
+giop_recv_set_timeout (const glong timeout)
+{
+	if (0 < timeout) /* We really do not want (timeout
<= 0) as that would potentially block forever */
+		giop_initial_timeout_limit = timeout;
+}
+
 GIOPRecvBuffer *
-giop_recv_buffer_get (GIOPMessageQueueEntry *ent)
+giop_recv_buffer_get (GIOPMessageQueueEntry *ent,
+		      gboolean *timeout)
 {
 	GIOPThread *tdata = giop_thread_self ();
+	GTimeVal tval;
 
  thread_switch:
 	if (giop_thread_io ()) {
 -704,8
+715,17  giop_recv_buffer_get (GIOPMessageQueueEn
 				ent_unlock (ent);
 				giop_thread_queue_process (tdata);
 				ent_lock (ent);
-			} else
-				g_cond_wait (tdata->incoming, tdata->lock);
+			} else {
+				if (0 < giop_initial_timeout_limit) {
+					g_get_current_time (&tval);
+					g_time_val_add (&tval,
giop_initial_timeout_limit);
+				}
+				if (!g_cond_timed_wait (tdata->incoming,
tdata->lock, ((0 < giop_initial_timeout_limit) ?
&tval : NULL))) {
+					*timeout = TRUE;
+					break;
+				} else
+					*timeout = FALSE;
+			}
 		}
 		
 		ent_unlock (ent);
 -1352,3
+1372,4  giop_recv_buffer_use_buf (GIOPConnection
 
 	return buf;
 }
+
Index: src/orb/orb-core/corba-orb.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/corba-orb.c,v
retrieving revision 1.123
diff -u -p -r1.123 corba-orb.c
--- src/orb/orb-core/corba-orb.c	16 Aug 2006 09:28:30
-0000	1.123
+++ src/orb/orb-core/corba-orb.c	4 Dec 2006 13:20:51 -0000
 -61,7
+61,7  static char        *orbit_debug_options 
 static char        *orbit_naming_ref         = NULL;
 static GSList      *orbit_initref_list       = NULL; 
 static gboolean     orbit_use_corbaloc       = FALSE;
-
+static gint         orbit_timeout_limit      = -1;
 void
 ORBit_ORB_start_servers (CORBA_ORB orb)
 {
 -417,8
+417,8  CORBA_ORB_init (int *argc, char **argv,
 	}
 #endif /* G_ENABLE_DEBUG */
 
-
 	giop_recv_set_limit (orbit_initial_recv_limit);
+	giop_recv_set_timeout (orbit_timeout_limit);
 	giop_init (thread_safe,
 		   orbit_use_ipv4 || orbit_use_ipv6 ||
 		   orbit_use_irda || orbit_use_ssl);
 -1467,6
+1467,7  const ORBit_option orbit_supported_optio
 	{ "ORBDebugFlags",      ORBIT_OPTION_STRING, 
&orbit_debug_options },
 	{ "ORBInitRef",         ORBIT_OPTION_KEY_VALUE, 
&orbit_initref_list},
 	{ "ORBCorbaloc",        ORBIT_OPTION_BOOLEAN,
&orbit_use_corbaloc},
+	{ "GIOPTimeoutLimit",   ORBIT_OPTION_INT,    
&orbit_timeout_limit },
 	{ NULL,                 0,                    NULL },
 };
 
Index: src/orb/orb-core/orbit-small.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/orbit-small.c,v
retrieving revision 1.97
diff -u -p -r1.97 orbit-small.c
--- src/orb/orb-core/orbit-small.c	18 May 2006 14:20:49
-0000	1.97
+++ src/orb/orb-core/orbit-small.c	4 Dec 2006 13:20:51 -0000
 -591,6
+591,7  ORBit_small_invoke_stub (CORBA_Object   
 	GIOPRecvBuffer         *recv_buffer = NULL;
 	CORBA_Object            xt_proxy = CORBA_OBJECT_NIL;
 	ORBitPolicy            *invoke_policy = CORBA_OBJECT_NIL;
+	gboolean                timeout = FALSE;
 
 	if (!obj) {
 		dprintf (MESSAGES, "Cannot invoke method on null
objectn");
 -654,7
+655,9  ORBit_small_invoke_stub (CORBA_Object   
 		goto clean_out;
 	}
 
-	recv_buffer = giop_recv_buffer_get (&mqe);
+	recv_buffer = giop_recv_buffer_get (&mqe,
&timeout);
+	if (timeout)
+		goto timeout_exception;
 
 	switch (orbit_small_demarshal (obj, &cnx, recv_buffer,
ev,
 				       ret, m_data, args))
 -698,6
+701,12  ORBit_small_invoke_stub (CORBA_Object   
 	tprintf ("[System exception comm failure] )");
 	CORBA_exception_set_system (ev, ex_CORBA_COMM_FAILURE,
 				    completion_status);
+	goto clean_out;
+
+ timeout_exception:
+	tprintf ("[System exception timeout] )");
+	CORBA_exception_set_system (ev, ex_CORBA_TIMEOUT,
+				    CORBA_COMPLETED_NO);
 	goto clean_out;
 }
 


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 08:50:06
On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> Please review and test this patch. I'm not entirely
convinced that I
> haven't introduced a bad bug or two but at least it
works(*) here.

No responses yet, so it seems that my patch is of the usual
flawless
quality  

Anyway, I'll commit tomorrow if there is no objections.

Best regards,
  jules


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 10:47:35
On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> > Please review and test this patch. I'm not
entirely convinced that I
> > haven't introduced a bad bug or two but at least
it works(*) here.
> 
> No responses yet, so it seems that my patch is of the
usual flawless
> quality  

  Heh 
  
  In fact, the patch sounds really nice, a sign of ORBit2
maturing for
TCP/IP connections (which is traditionally seldom tested).

  My only comment is that the changelog entry could use a
bit more
detail.  You only mention changes in one function, but the
patch changes
4 functions and even adds a new commandline option which
deserves a
mention in the changelog too.

  Cheers,

-- 
Gustavo J. A. M. Carneiro
<gjcinescporto.pt> <gustavousers.sourceforge.net>
The universe is always one step beyond logic.

_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 10:56:21
On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro
wrote:
> On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding
wrote:
> > > Please review and test this patch. I'm not
entirely convinced that I
> > > haven't introduced a bad bug or two but at
least it works(*) here.
> > 
> > No responses yet, so it seems that my patch is of
the usual flawless
> > quality  
> 
>   Heh 
>   
>   In fact, the patch sounds really nice, a sign of
ORBit2 maturing for
> TCP/IP connections (which is traditionally seldom
tested).
> 
>   My only comment is that the changelog entry could use
a bit more
> detail.  You only mention changes in one function, but
the patch changes
> 4 functions and even adds a new commandline option
which deserves a
> mention in the changelog too.

OK, I'll expand the ChangeLog and resend the patch 


Best regards,
  jules


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 12:06:27
On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M. Carneiro
wrote:
> On Ter, 2006-12-05 at 09:50 +0100, Jules Colding wrote:
> > On Mon, 2006-12-04 at 14:22 +0100, Jules Colding
wrote:
> > > Please review and test this patch. I'm not
entirely convinced that I
> > > haven't introduced a bad bug or two but at
least it works(*) here.
> > 
> > No responses yet, so it seems that my patch is of
the usual flawless
> > quality  
> 
>   Heh 
>   
>   In fact, the patch sounds really nice, a sign of
ORBit2 maturing for
> TCP/IP connections (which is traditionally seldom
tested).
> 
>   My only comment is that the changelog entry could use
a bit more
> detail.  You only mention changes in one function, but
the patch changes
> 4 functions and even adds a new commandline option
which deserves a
> mention in the changelog too.

Here is the new and better documented patch.

Best regards,
  jules


Index: ChangeLog
============================================================
=======
RCS file: /cvs/gnome/ORBit2/ChangeLog,v
retrieving revision 1.778
diff -u -p -r1.778 ChangeLog
--- ChangeLog	22 Nov 2006 20:34:39 -0000	1.778
+++ ChangeLog	5 Dec 2006 12:04:04 -0000
 -1,3
+1,95 
+2006-12-05  Jules Colding  <coldingomesc.com>
+
+	* ORBit2: The previous two ChangeLog entries are hiding a
lot of 
+	details which I'll hereby try to expand upon. 
+
+	The rationale for these changes is that any and all
occurrences of 
+	g_cond_wait() in ORBit2 has the potential to block forever
if it 
+	waits for a connection attempt to complete if said
connection 
+	attempt is made towards a remote server which happens to
be 
+	physically disconnected or powered off. 
+
+	This blocking behavior can be demonstrated by invoking an
operation 
+	on a remote object that is physically inaccessible such as
when
+	the remote server is powered down or physically
disconnected.
+
+	Pseudo code:
+	
+	{
+	    CORBA_ORB orb = get_orb();
+	    char *objref = get_corbaloc_str();
+	    CORBA_Object obj = CORBA_OBJECT_NIL;
+	    CORBA_Environment ev[1];
+	
+	    CORBA_exception_init(ev);
+	    obj = CORBA_ORB_string_to_object(orb, objref, ev);
+
+	    FOO_INTERFACE_method(obj, <arguments>, ev); 
+	}
+
+	The last statement above will block forever in
g_cond_wait() while
+	waiting for the recieve buffer to be ready if the remote
server is, 
+	say, powered down. An exception is on the other hand
immediately 
+	thrown if the server object is merely gone.
+
+	The changes to the code therefore boils down to the
implementation 
+	of the ex_CORBA_TIMEOUT system exception being thrown if 
+	g_cond_timed_wait() times out. Most occurrences of
g_cond_wait() 
+	has been replaced by g_cond_timed_wait(). 
+
+	The only remaining instance of g_cond_wait() is found in 
+	link_exec_command(). I haven't observed any problems with 
+	link_exec_command() so I've not touched that code.
+
+	The affected code is:
+
+	1) A new ORB command line option: GIOPTimeoutLimit -
timeout limit 
+	in microseconds. Defaults to 30 seconds.
+
+	2) giop_recv_buffer_get(). This one is obvious. We do not
want to 
+	wait indefinitely for data to be received from a downed
host. The 
+	waiting interval is configurable with the new
GIOPTimeoutLimit ORB 
+	option.
+
+	3) Any code invoking link_wait():
+
+	   a) link_connection_wait_connected_T()
+	   b) link_connection_try_reconnect()
+	   c) giop_connection_try_reconnect(), calls
link_connection_try_reconnect()
+	   d) ORBit_try_connection_T(), calls
giop_connection_try_reconnect()
+	   e) ORBit_object_get_connection(), calls
ORBit_try_connection_T()
+
+	4) The waiting time in link_wait() is set at compile time
to 
+	LINK_WAIT_TIMEOUT_USEC which is presently 10 seconds.
+
+	5) link_wait() has been modified to return a gboolean.
FALSE if the
+	timeout expired, TRUE otherwise.
+
+	6) link_connection_wait_connected_T() and
link_connection_try_reconnect() 
+	will both disconnect the connection if link_wait()
experienced a timeout.
+
+2006-12-04  Jules Colding  <coldingomesc.com>
+
+	* configure.in: Added autoconf 2.60+ required datarootdir
variable
+
+2006-12-03  Jules Colding  <coldingomesc.com>
+
+	* include/orbit/GIOP/giop.h: Declare
giop_recv_set_timeout().
+
+	* src/orb/orb-core/corba-orb.c (CORBA_ORB_init): Set GIOP
timeout 
+	limit from the ORB option.
+	Added new ORB option - GIOPTimeoutLimit.
+
+	* src/orb/orb-core/orbit-small.c
(ORBit_small_invoke_stub): Throw a 
+	TIMEOUT exception if a reply hasn't been recieved within
the GIOP
+	timeout limit.
+
+	* src/orb/GIOP/giop-recv-buffer.c (giop_recv_buffer_get):
Replaced 
+	a g_cond_wait() with a g_cond_timed_wait(). The original
g_cond_wait()
+	could potentially block forever if a remote server was
physically
+	offline.
+	(giop_recv_set_timeout): Function to adjust the GIOP
timeout limit.
+
 2006-11-22  Kjartan Maraas  <kmaraasgnome.org>
 
 	* ORBit-2.0.pc.in: Move MINGW_CFLAGS and LIBM to
Libs.private
Index: configure.in
============================================================
=======
RCS file: /cvs/gnome/ORBit2/configure.in,v
retrieving revision 1.175
diff -u -p -r1.175 configure.in
--- configure.in	22 Nov 2006 20:34:41 -0000	1.175
+++ configure.in	5 Dec 2006 12:04:04 -0000
 -37,6
+37,9  AM_CONFIG_HEADER(config.h)
 dnl Initialize automake stuff
 AM_INIT_AUTOMAKE(ORBit2, $ORBIT_VERSION, no-define)
 
+dnl Required by autoconf 2.60
+AC_SUBST(datarootdir)
+
 AC_CANONICAL_HOST
 AC_MSG_CHECKING([for Win32])
 case "$host" in
Index: include/orbit/GIOP/giop-recv-buffer.h
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/include/orbit/GIOP/giop-recv-buffer.h,v
retrieving revision 1.33
diff -u -p -r1.33 giop-recv-buffer.h
--- include/orbit/GIOP/giop-recv-buffer.h	14 Jan 2004
11:04:01 -0000	1.33
+++ include/orbit/GIOP/giop-recv-buffer.h	5 Dec 2006
12:04:04 -0000
 -58,7
+58,8  void            giop_recv_list_setup_que
 void            giop_recv_list_setup_queue_entry_async
(GIOPMessageQueueEntry *ent,
 							GIOPAsyncCallback      cb);
 
-GIOPRecvBuffer *giop_recv_buffer_get              
(GIOPMessageQueueEntry *ent);
+GIOPRecvBuffer *giop_recv_buffer_get              
(GIOPMessageQueueEntry *ent,
+						    gboolean *timeout);
 void            giop_recv_buffer_unuse            
(GIOPRecvBuffer        *buf);
 
 #define giop_recv_buffer_reply_status(buf) (               
                            
Index: include/orbit/GIOP/giop-types.h
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/include/orbit/GIOP/giop-types.h,v
retrieving revision 1.21
diff -u -p -r1.21 giop-types.h
--- include/orbit/GIOP/giop-types.h	27 Oct 2003 16:14:12
-0000	1.21
+++ include/orbit/GIOP/giop-types.h	5 Dec 2006 12:04:04
-0000
 -35,7
+35,9  struct _GIOPThread {
 					gpointer dummy);
 };
 
-#define GIOP_INITIAL_MSG_SIZE_LIMIT 256*1024
+#define GIOP_INITIAL_TIMEOUT_LIMIT (30000000) /* 30 seconds
*/
+
+#define GIOP_INITIAL_MSG_SIZE_LIMIT (256*1024) /* in bytes
*/
 
 typedef enum {
 	GIOP_CONNECTION_SSL
Index: include/orbit/GIOP/giop.h
============================================================
=======
RCS file: /cvs/gnome/ORBit2/include/orbit/GIOP/giop.h,v
retrieving revision 1.24
diff -u -p -r1.24 giop.h
--- include/orbit/GIOP/giop.h	5 Apr 2006 07:16:10 -0000	1.24
+++ include/orbit/GIOP/giop.h	5 Dec 2006 12:04:04 -0000
 -23,6
+23,7  gboolean    giop_thread_safe       (void
 gboolean    giop_thread_io         (void);
 GIOPThread *giop_thread_self       (void);
 void        giop_invoke_async      (GIOPMessageQueueEntry
*ent);
+void        giop_recv_set_timeout  (const glong timeout);
 void        giop_recv_set_limit    (glong limit);
 glong       giop_recv_get_limit    (void);
 void        giop_incoming_signal_T (GIOPThread *tdata,
GIOPMsgType t);
 -45,6
+46,7  void        giop_thread_new_check       
 void        giop_thread_queue_process    (GIOPThread
*tdata);
 gboolean    giop_thread_queue_empty_T    (GIOPThread
*tdata);
 void        giop_thread_queue_tail_wakeup(GIOPThread
*tdata);
+
 
 #endif /* ORBIT2_INTERNAL_API */
 
Index: linc2/ChangeLog
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/ChangeLog,v
retrieving revision 1.262
diff -u -p -r1.262 ChangeLog
--- linc2/ChangeLog	22 Nov 2006 19:13:41 -0000	1.262
+++ linc2/ChangeLog	5 Dec 2006 12:04:05 -0000
 -1,3
+1,8 
+2006-12-04  Jules Colding  <coldingomesc.com>
+
+	* src/linc.c (link_wait): Do not wait forever for the link
+	condition to get signaled.
+
 2006-11-22  Kjartan Maraas  <kmaraasgnome.org>
 
 	* src/Makefile.am: Remove SSL_LIBS and SSL_CFLAGS.
Index: linc2/include/linc/linc.h
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/include/linc/linc.h,v
retrieving revision 1.17
diff -u -p -r1.17 linc.h
--- linc2/include/linc/linc.h	28 Jan 2005 12:34:57
-0000	1.17
+++ linc2/include/linc/linc.h	5 Dec 2006 12:04:05 -0000
 -33,7
+33,7  GMainLoop *link_main_get_loop    (void);
 guint      link_main_idle_add    (GSourceFunc function,
 				  gpointer    data);
 
-void       link_wait             (void);
+gboolean   link_wait             (void);
 void       link_signal           (void);
 
 gboolean   link_thread_io        (void);
Index: linc2/src/linc-connection.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/src/linc-connection.c,v
retrieving revision 1.117
diff -u -p -r1.117 linc-connection.c
--- linc2/src/linc-connection.c	9 Sep 2006 07:54:37
-0000	1.117
+++ linc2/src/linc-connection.c	5 Dec 2006 12:04:06 -0000
 -635,8
+635,10  link_connection_do_initiate (LinkConnect
 static LinkConnectionStatus
 link_connection_wait_connected_T (LinkConnection *cnx)
 {
-	while (cnx && cnx->status == LINK_CONNECTING)
-		link_wait ();
+	while (cnx && cnx->status == LINK_CONNECTING) {
+		if (!link_wait ())
+			link_connection_disconnect (cnx);
+	}
 
 	return cnx ? cnx->status : LINK_DISCONNECTED;
 }
 -659,8
+661,12  link_connection_try_reconnect (LinkConne
 			cnx->inhibit_reconnect = FALSE;
 			dispatch_callbacks_drop_lock (cnx);
 			g_main_context_release (NULL);
-		} else 
-			link_wait ();
+		} else {
+			if (!link_wait ()) {
+				link_connection_disconnect (cnx);
+				break;
+			}
+		}
 	}
 
 	if (cnx->status != LINK_DISCONNECTED)
Index: linc2/src/linc.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/linc2/src/linc.c,v
retrieving revision 1.63
diff -u -p -r1.63 linc.c
--- linc2/src/linc.c	2 Jun 2006 08:14:22 -0000	1.63
+++ linc2/src/linc.c	5 Dec 2006 12:04:06 -0000
 -52,6
+52,9  SSL_METHOD *link_ssl_method;
 SSL_CTX    *link_ssl_ctx;
 #endif
 
+/* max time to wait for the link condition to get signaled
- 10 seconds */
+#define LINK_WAIT_TIMEOUT_USEC (10000000) 
+
 static void link_dispatch_command (gpointer data, gboolean
immediate);
 
 gboolean
 -534,17
+537,28  link_signal (void)
 	}
 }
 
-void
+gboolean
 link_wait (void)
 {
+	GTimeVal gtime;
+
 	if (!(link_is_thread_safe &&
link_is_io_in_thread)) {
 		link_unlock ();
 		link_main_iteration (TRUE);
 		link_lock ();
 	} else {
 		g_assert (link_main_cond != NULL);
-		g_cond_wait (link_main_cond, link_main_lock);
+
+		g_get_current_time (&gtime);
+		g_time_val_add (&gtime, LINK_WAIT_TIMEOUT_USEC);
+		if (!g_cond_timed_wait (link_main_cond, link_main_lock,
&gtime)) {
+			if (link_is_locked ())
+				link_unlock ();
+			return FALSE;
+		}
 	}
+
+	return TRUE;
 }
 
 
Index: src/orb/GIOP/giop-recv-buffer.c
============================================================
=======
RCS file:
/cvs/gnome/ORBit2/src/orb/GIOP/giop-recv-buffer.c,v
retrieving revision 1.81
diff -u -p -r1.81 giop-recv-buffer.c
--- src/orb/GIOP/giop-recv-buffer.c	5 Apr 2006 07:17:17
-0000	1.81
+++ src/orb/GIOP/giop-recv-buffer.c	5 Dec 2006 12:04:07
-0000
 -690,10
+690,21  check_got (GIOPMessageQueueEntry *ent)
 		(ent->cnx->parent.status == LINK_DISCONNECTED));
 }
 
+static glong giop_initial_timeout_limit =
GIOP_INITIAL_TIMEOUT_LIMIT;
+
+void
+giop_recv_set_timeout (const glong timeout)
+{
+	if (0 < timeout) /* We really do not want (timeout
<= 0) as that would potentially block forever */
+		giop_initial_timeout_limit = timeout;
+}
+
 GIOPRecvBuffer *
-giop_recv_buffer_get (GIOPMessageQueueEntry *ent)
+giop_recv_buffer_get (GIOPMessageQueueEntry *ent,
+		      gboolean *timeout)
 {
 	GIOPThread *tdata = giop_thread_self ();
+	GTimeVal tval;
 
  thread_switch:
 	if (giop_thread_io ()) {
 -704,8
+715,17  giop_recv_buffer_get (GIOPMessageQueueEn
 				ent_unlock (ent);
 				giop_thread_queue_process (tdata);
 				ent_lock (ent);
-			} else
-				g_cond_wait (tdata->incoming, tdata->lock);
+			} else {
+				if (0 < giop_initial_timeout_limit) {
+					g_get_current_time (&tval);
+					g_time_val_add (&tval,
giop_initial_timeout_limit);
+				}
+				if (!g_cond_timed_wait (tdata->incoming,
tdata->lock, ((0 < giop_initial_timeout_limit) ?
&tval : NULL))) {
+					*timeout = TRUE;
+					break;
+				} else
+					*timeout = FALSE;
+			}
 		}
 		
 		ent_unlock (ent);
 -1352,3
+1372,4  giop_recv_buffer_use_buf (GIOPConnection
 
 	return buf;
 }
+
Index: src/orb/orb-core/corba-orb.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/corba-orb.c,v
retrieving revision 1.123
diff -u -p -r1.123 corba-orb.c
--- src/orb/orb-core/corba-orb.c	16 Aug 2006 09:28:30
-0000	1.123
+++ src/orb/orb-core/corba-orb.c	5 Dec 2006 12:04:07 -0000
 -61,7
+61,7  static char        *orbit_debug_options 
 static char        *orbit_naming_ref         = NULL;
 static GSList      *orbit_initref_list       = NULL; 
 static gboolean     orbit_use_corbaloc       = FALSE;
-
+static gint         orbit_timeout_limit      = -1;
 void
 ORBit_ORB_start_servers (CORBA_ORB orb)
 {
 -417,8
+417,8  CORBA_ORB_init (int *argc, char **argv,
 	}
 #endif /* G_ENABLE_DEBUG */
 
-
 	giop_recv_set_limit (orbit_initial_recv_limit);
+	giop_recv_set_timeout (orbit_timeout_limit);
 	giop_init (thread_safe,
 		   orbit_use_ipv4 || orbit_use_ipv6 ||
 		   orbit_use_irda || orbit_use_ssl);
 -1467,6
+1467,7  const ORBit_option orbit_supported_optio
 	{ "ORBDebugFlags",      ORBIT_OPTION_STRING, 
&orbit_debug_options },
 	{ "ORBInitRef",         ORBIT_OPTION_KEY_VALUE, 
&orbit_initref_list},
 	{ "ORBCorbaloc",        ORBIT_OPTION_BOOLEAN,
&orbit_use_corbaloc},
+	{ "GIOPTimeoutLimit",   ORBIT_OPTION_INT,    
&orbit_timeout_limit },
 	{ NULL,                 0,                    NULL },
 };
 
Index: src/orb/orb-core/orbit-small.c
============================================================
=======
RCS file: /cvs/gnome/ORBit2/src/orb/orb-core/orbit-small.c,v
retrieving revision 1.97
diff -u -p -r1.97 orbit-small.c
--- src/orb/orb-core/orbit-small.c	18 May 2006 14:20:49
-0000	1.97
+++ src/orb/orb-core/orbit-small.c	5 Dec 2006 12:04:08 -0000
 -591,6
+591,7  ORBit_small_invoke_stub (CORBA_Object   
 	GIOPRecvBuffer         *recv_buffer = NULL;
 	CORBA_Object            xt_proxy = CORBA_OBJECT_NIL;
 	ORBitPolicy            *invoke_policy = CORBA_OBJECT_NIL;
+	gboolean                timeout = FALSE;
 
 	if (!obj) {
 		dprintf (MESSAGES, "Cannot invoke method on null
objectn");
 -654,7
+655,9  ORBit_small_invoke_stub (CORBA_Object   
 		goto clean_out;
 	}
 
-	recv_buffer = giop_recv_buffer_get (&mqe);
+	recv_buffer = giop_recv_buffer_get (&mqe,
&timeout);
+	if (timeout)
+		goto timeout_exception;
 
 	switch (orbit_small_demarshal (obj, &cnx, recv_buffer,
ev,
 				       ret, m_data, args))
 -698,6
+701,12  ORBit_small_invoke_stub (CORBA_Object   
 	tprintf ("[System exception comm failure] )");
 	CORBA_exception_set_system (ev, ex_CORBA_COMM_FAILURE,
 				    completion_status);
+	goto clean_out;
+
+ timeout_exception:
+	tprintf ("[System exception timeout] )");
+	CORBA_exception_set_system (ev, ex_CORBA_TIMEOUT,
+				    CORBA_COMPLETED_NO);
 	goto clean_out;
 }
 


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 12:08:54
On Tue, 2006-12-05 at 13:06 +0100, Jules Colding wrote:
> On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M.
Carneiro wrote:
> > On Ter, 2006-12-05 at 09:50 +0100, Jules Colding
wrote:
> > > On Mon, 2006-12-04 at 14:22 +0100, Jules
Colding wrote:
> > > > Please review and test this patch. I'm
not entirely convinced that I
> > > > haven't introduced a bad bug or two but
at least it works(*) here.
> > > 
> > > No responses yet, so it seems that my patch
is of the usual flawless
> > > quality  
> > 
> >   Heh 
> >   
> >   In fact, the patch sounds really nice, a sign of
ORBit2 maturing for
> > TCP/IP connections (which is traditionally seldom
tested).
> > 
> >   My only comment is that the changelog entry
could use a bit more
> > detail.  You only mention changes in one function,
but the patch changes
> > 4 functions 

BTW: The changes to link_wait() is in linc2/ChangeLog.

-- 
  jules


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Timeout support - Please review
user name
2006-12-05 13:22:19
On Ter, 2006-12-05 at 13:06 +0100, Jules Colding wrote:
> On Tue, 2006-12-05 at 10:47 +0000, Gustavo J. A. M.
Carneiro wrote:
> > On Ter, 2006-12-05 at 09:50 +0100, Jules Colding
wrote:
> > > On Mon, 2006-12-04 at 14:22 +0100, Jules
Colding wrote:
> > > > Please review and test this patch. I'm
not entirely convinced that I
> > > > haven't introduced a bad bug or two but
at least it works(*) here.
> > > 
> > > No responses yet, so it seems that my patch
is of the usual flawless
> > > quality  
> > 
> >   Heh 
> >   
> >   In fact, the patch sounds really nice, a sign of
ORBit2 maturing for
> > TCP/IP connections (which is traditionally seldom
tested).
> > 
> >   My only comment is that the changelog entry
could use a bit more
> > detail.  You only mention changes in one function,
but the patch changes
> > 4 functions and even adds a new commandline option
which deserves a
> > mention in the changelog too.
> 
> Here is the new and better documented patch.

  Wow!  I wasn't asking for so much level detail... 

  But I guess this is good, if a bit unusual 

  Best regards,

-- 
Gustavo J. A. M. Carneiro
<gjcinescporto.pt> <gustavousers.sourceforge.net>
The universe is always one step beyond logic.


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list
Re: Timeout support - Please review
country flaguser name
United States
2007-06-12 04:02:13
Hi Jules,

On Mon, 2006-12-04 at 14:22 +0100, Jules Colding wrote:
> Any and all occurrences of g_cond_wait() has the
potential to block
> forever if the remote server is physically disconnected
or powered of.

	Urgh - but also, some methods take longer than 30 seconds
to execute,
and will not respond in that time.

	Also - when you are debugging, it is extremely normal to
want to pause
everything in the debugger to examine some interaction - and
you don't
expect everything to fail in unpredictable (or even silent
ways) when
you continue.

	So: can we do this for IPv4/6 connections only ? local unix

	    socket connections have strong lifecycle guarentees and
this
	    covers the desktop use-case nicely.

> +	* src/linc.c (link_wait): Do not wait forever for the
link
> +	condition to get signaled.
.. 
> +				if (!g_cond_timed_wait (tdata->incoming,
tdata->lock, ((0 < giop_initial_timeout_limit) ?
&tval : NULL))) {
> +					*timeout = TRUE;
> +					break;

	So I'm convinced this code is in the wrong place - worse,
the link_wait
function now -releases- a mutex it didn't take itself: which
looks
horribly un-maintainable: who took that lock ? and worse
doing:

	if (foo_is_locked())
		unlock_foo();

	is just grotesquely racey.

	So - IMHO, the -right- way to implement this is rather
simpler:

	* in the IO thread, we need to add a simple 'timeout'
source to
	  the glib mainloop, that will timeout after (however long)
and
	  in this case just push a constructed CORBA_COMM exception
into
	  a synthesised reply that we shove into the queue as
normal:
	  thus signalling the waiting caller.

	Surely that is simpler, easier, touches just 1 place, and
doesn't
introduce odd behavior ? 

	Thanks,

		Michael.

PS. if you really want patch review, can you CC me
explicitly.
-- 
 michael.meeksnovell.com  <><, Pseudo Engineer,
itinerant idiot


_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list

Re: Timeout support - Please review
country flaguser name
Denmark
2007-06-12 04:40:49
Hi Michael,

On Tue, 2007-06-12 at 10:02 +0100, Michael Meeks wrote:
> On Mon, 2006-12-04 at 14:22 +0100, Jules Colding
wrote:
> > Any and all occurrences of g_cond_wait() has the
potential to block
> > forever if the remote server is physically
disconnected or powered of.
> 
> 	Urgh - but also, some methods take longer than 30
seconds to execute,
> and will not respond in that time.

Yes, you are right.


> 	So: can we do this for IPv4/6 connections only ? local
unix 
> 	    socket connections have strong lifecycle
guarentees and this
> 	    covers the desktop use-case nicely.

Yes.


> > +	* src/linc.c (link_wait): Do not wait forever
for the link
> > +	condition to get signaled.
> .. 
> > +				if (!g_cond_timed_wait (tdata->incoming,
tdata->lock, ((0 < giop_initial_timeout_limit) ?
&tval : NULL))) {
> > +					*timeout = TRUE;
> > +					break;
> 
> 	So I'm convinced this code is in the wrong place -
worse, the link_wait
> function now -releases- a mutex it didn't take itself:
which looks
> horribly un-maintainable: who took that lock ? and
worse doing:
> 
> 	if (foo_is_locked())
> 		unlock_foo();
> 
> 	is just grotesquely racey.

Yes, I know. Despite my previous bragging about
"flawless quality" I was
actually a little worried about this too...


> 	So - IMHO, the -right- way to implement this is rather
simpler:
> 
> 	* in the IO thread, we need to add a simple 'timeout'
source to
> 	  the glib mainloop, that will timeout after (however
long) and
> 	  in this case just push a constructed CORBA_COMM
exception into
> 	  a synthesised reply that we shove into the queue as
normal:
> 	  thus signalling the waiting caller.
> 
> 	Surely that is simpler, easier, touches just 1 place,
and doesn't
> introduce odd behavior ? 

Absolutely. I'll reverse my previous patch and look into
this approach
instead. 

Stay tuned,
  jules



_______________________________________________
orbit-list mailing list
orbit-listgnome.org
htt
p://mail.gnome.org/mailman/listinfo/orbit-list

[1-9]

about | contact  Other archives ( Real Estate discussion Medical topics )