Event loss issue

Dear all,

I’m experiencing problems with the event system in my application and would like to ask for suggestions on how to debug and understand the error.
I have 3 C++ device servers with some attributes (created dynamically at device startup with Yat) and a C++ device client that subscribes to CHANGE/USER/PERIODIC events from those attributes (about 20 attributes in total) using a single callback. The maximum polling period for those attributes is 1000 ms. The client has a user thread where events are processed (i.e. the event data received in the callback are passed to this worker thread).
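
To make the setup concrete, here is a minimal sketch of the kind of callback-to-worker hand-off described above. It is not the actual client code: `EventSample` and `EventQueue` are hypothetical names, and the Tango event data is reduced to a plain struct so that the example compiles without Tango. The idea is that the callback only copies the data into a bounded queue and returns quickly, while the worker thread pops with a timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical plain copy of the fields of interest from a received event.
struct EventSample {
    std::string attr_name;
    double value;
    bool has_error;
};

// Bounded queue between the event callback (producer) and the worker thread (consumer).
class EventQueue {
public:
    explicit EventQueue(std::size_t max_size) : max_size_(max_size), dropped_(0) {
        assert(max_size_ > 0);
    }

    // Meant to be called from the callback: holds the lock only briefly.
    // When the queue is full, the oldest sample is discarded and counted.
    void push(EventSample sample) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (queue_.size() >= max_size_) {
                queue_.pop_front();
                ++dropped_;
            }
            queue_.push_back(std::move(sample));
        }
        cv_.notify_one();
    }

    // Meant to be called from the worker thread: waits at most `timeout`.
    // Returns false if no sample arrived in time.
    bool pop(EventSample& out, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!cv_.wait_for(lock, timeout, [this] { return !queue_.empty(); })) {
            return false;
        }
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

    // Number of samples discarded because the worker was too slow.
    std::size_t dropped() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return dropped_;
    }

private:
    const std::size_t max_size_;
    std::size_t dropped_;
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::deque<EventSample> queue_;
};
```

With such a scheme, a non-zero `dropped()` count would indicate that the worker thread cannot keep up, which is one way events could appear to be lost on the client side without any problem in the transport.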

The problem I’m facing is that, after the client has been running for a while, some of the events are no longer received. I suspect a deadlock (and possibly a memory issue?).

One of the problems in the client, if I’m not mistaken, can be traced to code that pushes events manually from a user thread.
In fact, I occasionally saw these error messages in the client:


Tango exception   
Severity = ERROR   
Error reason = API_CommandTimedOut   
Desc : Not able to acquire serialization (dev, class or process) monitor   
Origin : TangoMonitor::get_monitor

and also:


Tango::ZmqEventConsumer::push_heartbeat_event() timeout on channel monitor of (dserver address)

These (apparently) disappeared when I disabled event pushing from the user threads (i.e. leaving only event generation from the polling thread).
Could you give me more details on these Tango core messages?

However, the problem of events no longer being received did not disappear.
I understand that it is very difficult to help without code sketches or additional details, so I would first like to ask whether you have ever experienced the same problem (Tango version 9.2.5a, zmq v4.2.0, omni v4.2.1) and how to debug it. Is there any flag (e.g. related to zmq) that I can turn on to explore the issue in detail, both in the servers and in the client? Could it be a zmq issue (e.g. some queue being full)?
Right now I’m preparing a very simple device client that just subscribes to the same events and monitors the received event rate, to understand more, but any hints would help me a lot.
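
As a rough idea of the monitoring part of that test client, here is a minimal sketch, independent of Tango so that it can be compiled on its own. `EventRateMonitor` is a hypothetical helper: the callback would call `record()` with the attribute name for each received event, and a periodic check would call `stalled()` with a gap somewhat larger than the 1000 ms polling period to list the attributes that have gone silent.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical helper: counts events per attribute and remembers the last arrival time.
class EventRateMonitor {
public:
    using Clock = std::chrono::steady_clock;

    // To be called from the event callback for every received event.
    void record(const std::string& attr, Clock::time_point now = Clock::now()) {
        std::lock_guard<std::mutex> lock(mutex_);
        Stats& st = stats_[attr];
        ++st.count;
        st.last_seen = now;
    }

    // Total number of events recorded for one attribute (0 if never seen).
    unsigned long count(const std::string& attr) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = stats_.find(attr);
        return it == stats_.end() ? 0UL : it->second.count;
    }

    // Attributes whose last event is older than `max_gap`, in name order.
    std::vector<std::string> stalled(std::chrono::milliseconds max_gap,
                                     Clock::time_point now = Clock::now()) const {
        assert(max_gap.count() >= 0);
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<std::string> result;
        for (const auto& entry : stats_) {
            if (now - entry.second.last_seen > max_gap) {
                result.push_back(entry.first);
            }
        }
        return result;
    }

private:
    struct Stats {
        unsigned long count = 0;
        Clock::time_point last_seen;
    };

    mutable std::mutex mutex_;
    std::map<std::string, Stats> stats_;
};
```

Logging the output of `stalled()` together with a timestamp should at least show whether all attributes of one server stop at the same moment or whether individual attributes drop out one by one. Note that an attribute only appears in `stalled()` after it has delivered at least one event.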

Thanks for your support,

Simone