Tango Device-Server and Client Communication Issues

The main aim of this topic is how a client application can continue to perform its scheduled tasks or execute functions when the Tango device server is not communicating for some reason, such as the network being inaccessible or the device-server machine having hung.

I am using CPP/PyTango 9.4.2 and JTango 9.7.4 on Ubuntu 20.04/22.04 LTS.

I am trying to run a client application which prints the device time (‘devtime’ attribute) and the ‘state’ of the device server once per second. It also receives the expected change events from the device server every 10 seconds, when the ‘counter’ attribute value changes by ten.

The test was performed with four cases:

  • Case-1: Device Transparency - True, Device Timeout - Default (3 seconds)
  • Case-2: Device Transparency - False, Device Timeout - Default (3 seconds)
  • Case-3: Device Transparency - True, Device Timeout - 5 milliseconds (tuned when the DS is disconnected)
  • Case-4: Device Transparency - False, Device Timeout - 5 milliseconds (tuned when the DS is disconnected)

Queries or issues:

  1. With ‘Transparency’ set to True on the device proxy in the client program, the client application hangs for five to six minutes when the device server has been disconnected for more than 10 minutes. With ‘Transparency’ set to False, the client application does not hang. In both cases, the Tango DS connection is re-established automatically when the network comes back. Hence, if exceptions are handled correctly, is it safe to have transparency set to False?

  2. With transparency set to either False or True, the client exception after trying to read the state
    arrives after 6 to 7 seconds (with the default timeout) or 3 to 4 seconds (with the timeout set to 5 ms), which is 3 to 4 seconds more than expected.
    Is it possible to avoid this extra 3 to 4 seconds?

  3. After reconnection, two events are received from the device server. This is unexpected. Is it resolved in a recent Tango version? (The recent Tango release 10.1 probably mentions resolving this for the case where two clients are connected to the device server.)

  4. To avoid the client application hanging, which is the better way?
    (a) Set transparency to False, set the device timeout to 5 milliseconds while the
    device is not communicating, and restore the default timeout once it is communicating again.
    OR
    (b) Discard the device proxy and create a fresh device proxy to reconnect.
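
For illustration, option (a) could look roughly like the sketch below. It is written in Python and is only a sketch under assumptions: `AdaptiveTimeout` and its `call` method are hypothetical names, and the proxy is duck-typed so that any object exposing `set_timeout_millis` and `set_transparency_reconnection` (the method names used by the PyTango and JTango `DeviceProxy`) would fit. It catches the generic `Exception`; a real client would more likely catch the binding's own failure type. The 3000 ms and 5 ms values are the ones from the test cases above.

```python
class AdaptiveTimeout:
    """Shorten the proxy timeout while the device is unreachable,
    and restore the normal timeout after the first successful call.
    Hypothetical helper, not part of any Tango binding."""

    def __init__(self, proxy, normal_ms=3000, degraded_ms=5):
        self.proxy = proxy
        self.normal_ms = normal_ms
        self.degraded_ms = degraded_ms
        self.degraded = False
        # Transparency off: failures surface as exceptions instead of
        # silent reconnection attempts inside the call.
        proxy.set_transparency_reconnection(False)
        proxy.set_timeout_millis(normal_ms)

    def call(self, func, default=None):
        """Run func(proxy); return its result, or `default` on any failure."""
        try:
            result = func(self.proxy)
        except Exception:
            if not self.degraded:
                # First failure: switch to the short timeout.
                self.proxy.set_timeout_millis(self.degraded_ms)
                self.degraded = True
            return default
        if self.degraded:
            # First success after an outage: restore the normal timeout.
            self.proxy.set_timeout_millis(self.normal_ms)
            self.degraded = False
        return result
```

A once-per-second loop would then call something like `guard.call(lambda d: d.state(), default="UNKNOWN")`. One caveat of this design: with a timeout as short as 5 ms, a healthy but slow call could still fail, so the first successful call after an outage may take a few attempts.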

I am attaching the test report here for your reference.
TangoDS_Comm.pdf (320.6 KB)

Thank you,
Jitendra

Hi Jitendra.

Thanks for the very detailed report. The behaviour does seem strange, sorry. You are only using JTango, so hopefully one of the JTango experts can provide some insight. I can’t answer all your questions.

  1. Hence, if exceptions are handled correctly, is it safe to have transparency set to False?

Yes, you can set Transparency to False, if you catch the exceptions everywhere you use the DeviceProxy.
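
One way to keep "catch the exceptions everywhere" manageable is to route every proxy call through a single helper. The sketch below is a generic Python pattern, not a Tango API: `guarded`, `deadline_s` and the pool size are hypothetical names and values chosen for illustration. In addition to catching exceptions, it runs the call in a worker thread and stops waiting after a deadline, which is a different technique from the proxy timeout itself.

```python
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Small shared pool; the size is an arbitrary choice for this sketch.
_pool = ThreadPoolExecutor(max_workers=4)


def guarded(call, deadline_s=1.0, default=None):
    """Run call() in a worker thread.

    Returns (True, result) on success, or (False, default) if the call
    raises or does not finish within deadline_s seconds.
    """
    future = _pool.submit(call)
    try:
        return True, future.result(timeout=deadline_s)
    except FutureTimeout:
        # The caller stops waiting, but the worker thread stays busy
        # until the underlying call returns on its own.
        return False, default
    except Exception:
        # Any error raised by the call itself (e.g. a communication failure).
        return False, default
```

The scheduled loop can then do `ok, state = guarded(lambda: proxy.state(), default="UNKNOWN")` and carry on with its other tasks when `ok` is false. Note that a call that has timed out still occupies a worker until it returns, so a long outage can exhaust a small pool.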

  3. After reconnection, two events are received from the device server. This is unexpected. Is it resolved in a recent Tango version? (The recent Tango release 10.1 probably mentions resolving this for the case where two clients are connected to the device server.)

That issue only applies to cppTango (and PyTango, which uses it). It would not affect JTango.

I see the code uses setDev_timeout and set_timeout_millis. I don’t know what setDev_timeout is for.

Maybe you can get some more clues by enabling debug logging for the JacORB library. See “Logging - JTango User Manual” and “Annexes - JTango User Manual”.

If you use ATK Panel to monitor your device, does it have similar behaviour? E.g., very long “hanging” (5 to 10 minutes), updates only after 6 seconds when device restarted, etc.?

/Anton

Hello Anton,

Thanks for your response and the hints about further debugging. Here is my quick feedback:

(1) Yes, you can set Transparency to False, if you catch the exceptions everywhere you use the DeviceProxy.
=> I am catching exceptions carefully, hence I would prefer to set Transparency to False, as it avoids the client application hang.

(3) That issue only applies to cppTango (and PyTango, which uses it). It would not affect JTango.
=> In the log, JTango shows two events: one is stamped with the device-server IP and the other is not. I included the log in the report.

I see the code uses setDev_timeout and set_timeout_millis. I don’t know what setDev_timeout is for.
=> setDev_timeout is for the dserver device (admin device). It is set as well so that the KeepAliveThread and other Java threads that connect to the DServer do not use the default timeout of 3 seconds, but the configured timeout instead, e.g. 5 milliseconds.

If you use ATK Panel to monitor your device, does it have similar behaviour? E.g., very long “hanging” (5 to 10 minutes), updates only after 6 seconds when device restarted, etc.?
=> Yes, ATK Panel / Jive show that the application does hang for 5 to 6 minutes, as mentioned in the test report.

Jitendra