Thursday, August 30, 2007

Load Balancing for WCF

I spent two days working on issues that were side effects of an incorrect WCF configuration of wsHttpBinding for load-balanced environments:

http://msdn2.microsoft.com/en-us/library/ms730128.aspx
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=657708&SiteID=1
(Thanks to Anthony for digging those out while Conor and I were deep in debugging :))

It can be really hard, though, to realize that the load balancer might be what is causing the problems. Only one of the intermittent scenarios led to security exceptions in the service trace log, caused by the load balancer sending requests to a different server from the one with which the security context was initially established.

Other scenarios showed no security exceptions until the first timeout happened. After that, speculatively, the load balancer was switching to the second server for a channel initially established via the first server. We went through a long and intensive chain of investigation, so I can't remember all of the details now. The whole thing can be very intermittent.

The investigation was further complicated by the client being a C++ application that calls a WCF proxy replacement for COM+ components, which in turn calls a WCF service, which in turn calls COM+ components. The whole thing was done like this to substitute for Application Center :).

Suspecting that something was wrong on the load balancer side, I reconfigured the service and client to use basicHttpBinding with Windows integrated authentication, and it worked like a charm. The first article mentioned above gives a very good idea of what should be done to load balance with wsHttpBinding, but see the note in the MSDN article above:

Both the WSHttpBinding and the WSDualHttpBinding can be load balanced using HTTP load balancing techniques provided several modifications are made to the default binding configuration.

  • Turn off Security Context Establishment ...

  • Do not use reliable sessions. This feature is off by default.
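
A minimal sketch of how the two modifications quoted above might look in the service's config file. I am assuming message security with Windows credentials here, and the binding name `loadBalancedWsHttp` is made up for illustration; this is not the exact configuration from our project, and the client binding would need matching settings.

```xml
<system.serviceModel>
  <bindings>
    <wsHttpBinding>
      <!-- Hypothetical binding name; reference it from the endpoint's bindingConfiguration -->
      <binding name="loadBalancedWsHttp">
        <!-- Reliable sessions are off by default; stated explicitly for clarity -->
        <reliableSession enabled="false" />
        <security mode="Message">
          <!-- Assumption: Windows credentials. establishSecurityContext="false"
               turns off the secure conversation session that ties a client to one server -->
          <message clientCredentialType="Windows"
                   establishSecurityContext="false" />
        </security>
      </binding>
    </wsHttpBinding>
  </bindings>
</system.serviceModel>
```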

So it seems that wsHttpBinding can't be used in a load-balanced environment with reliable messaging. It is not obvious to me right now why not; I am going to investigate further.
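
For comparison, the basicHttpBinding fallback mentioned earlier could be configured roughly like this. It is a sketch under the assumption that "Windows integrated authentication" maps to transport-level Windows credentials over plain HTTP; the binding name `loadBalancedBasicHttp` is made up, and the hosting web server would also need Windows authentication enabled.

```xml
<system.serviceModel>
  <bindings>
    <basicHttpBinding>
      <!-- Hypothetical binding name, used on both the service and the client endpoint -->
      <binding name="loadBalancedBasicHttp">
        <!-- Assumption: authenticate at the HTTP transport level, no message security -->
        <security mode="TransportCredentialOnly">
          <transport clientCredentialType="Windows" />
        </security>
      </binding>
    </basicHttpBinding>
  </bindings>
</system.serviceModel>
```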

This would not have been possible to solve without:
Tools: WinDbg, WCF Service Trace Viewer, WCF Configuration Console, Process Explorer
Blogs:
http://blogs.msdn.com/tess/ - A must-read for anybody doing core .NET debugging!
Books: John Robbins - Debugging .NET 2.0 Applications - another must.
Great guys: Anthony, Conor


1 comment:

Balvvant said...

Hi,
We have the following server configuration in our live environment: a load balancer in front of the web servers (4 of them) and another load balancer in front of the WCF application servers (4 of them). Sticky sessions are on for both load balancers. We are using wsHttpBinding for calling the WCF services.



The problem is that we are not able to run the application; it gives the following error.


http://msdn.microsoft.com/en-GB/library/System.ServiceModel.Diagnostics.ThrowingException.aspx
Throwing an exception.
/LM/W3SVC/1818710304/Root-3-129944332545306871

System.ServiceModel.Security.SecurityNegotiationException, System.ServiceModel, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
Cannot find the negotiation state for the context 'uuid-59d97ce1-198c-4095-bd0a-825cf4eb580c-56'.

at System.ServiceModel.Security.NegotiationTokenAuthenticator`1.ProcessRequestCore(Message request)
at System.ServiceModel.Security.NegotiationTokenAuthenticator`1.NegotiationHost.NegotiationSyncInvoker.Invoke(Object instance, Object[] inputs, Object[]& outputs)
at System.ServiceModel.Dispatcher.DispatchOperationRuntime.InvokeBegin(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage5(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage41(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage4(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage31(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage3(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage2(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage11(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.ImmutableDispatchRuntime.ProcessMessage1(MessageRpc& rpc)
at System.ServiceModel.Dispatcher.MessageRpc.Process(Boolean isOperationContextSet)
at System.ServiceModel.Dispatcher.ChannelHandler.DispatchAndReleasePump(RequestContext request, Boolean cleanThread, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.HandleRequest(RequestContext request, OperationContext currentOperationContext)
at System.ServiceModel.Dispatcher.ChannelHandler.AsyncMessagePump(IAsyncResult result)
at System.ServiceModel.Dispatcher.ChannelHandler.OnAsyncReceiveComplete(IAsyncResult result)
at System.Runtime.Fx.AsyncThunk.UnhandledExceptionFrame(IAsyncResult result)
at System.ServiceModel.Diagnostics.TraceUtility.<>c__DisplayClass4.<CallbackGenerator>b__2(AsyncCallback callback, IAsyncResult result)
at System.Runtime.AsyncResult.Complete(Boolean completedSynchronously)
at System.Runtime.InputQueue`1.AsyncQueueReader.Set(Item item)
at System.Runtime.InputQueue`1.Dispatch()
at System.Runtime.InputQueue`1.OnDispatchCallback(Object state)
at System.Runtime.ActionItem.DefaultActionItem.Invoke()
at System.Runtime.ActionItem.CallbackHelper.InvokeWithoutContext(Object state)
at System.Runtime.IOThreadScheduler.ScheduledOverlapped.IOCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* nativeOverlapped)
at System.Runtime.Fx.IOCompletionThunk.UnhandledExceptionFrame(UInt32 error, UInt32 bytesRead, NativeOverlapped* nativeOverlapped)
at System.Threading._IOCompletionCallback.PerformIOCompletionCallback(UInt32 errorCode, UInt32 numBytes, NativeOverlapped* pOVERLAP)

System.ServiceModel.Security.SecurityNegotiationException: Cannot find the negotiation state for the context 'uuid-59d97ce1-198c-4095-bd0a-825cf4eb580c-56'.




But when I hard-code the address of one application server inside my web application and run it, it works fine. The only problem is that the application doesn't work when I run it across the load balancer. Can anyone explain why this is happening? We need to put the code on the live server ASAP, but we have not been able to crack this issue.

Thanks for your help in advance.