Weird indeed.
I looked at our databases:
the oldest, installed in 2005, has two rows in the _history_id tables; the newer ones, installed in 2010 and later, have 1 row in the _history_id tables.
Jan, would you post your modified Databaseds sources?
Regards
Hi,
To give you an idea of what I’m doing now, I attach my version of Databaseds for Tango 8.1.2.
Since I’m debugging it now it is full of printouts and dirty hacks which shouldn’t be used in standard code.
The main modifications are in DataBase::db_put_device_attribute_property2(…).
Bests,
Jan
[quote=“jan”]Hi,
To give you an idea of what I’m doing now, I attach my version of Databaseds for Tango 8.1.2.
Since I’m debugging it now it is full of printouts and dirty hacks which shouldn’t be used in standard code.
The main modifications are in DataBase::db_put_device_attribute_property2(…).
Bests,
Jan[/quote]
Thanks a lot. I will try it next week.
I’ve just realized that all attribute property transactions block each other through get_id().
So updating history_id should be put into a separate transaction, or history_id should be removed from the mysql db.
Otherwise there is no point in using row locking from innodb.
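To illustrate the idea above, here is a minimal sketch (in Python with sqlite3 as a stand-in for MySQL/InnoDB, with an illustrative schema, not Tango's real one): the counter bump is committed in its own short transaction, so its row lock is released before the long property update begins.

```python
import sqlite3

# Autocommit mode: transactions are opened and committed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE device_attribute_history_id (id INTEGER)")
conn.execute("INSERT INTO device_attribute_history_id VALUES (0)")
conn.execute("CREATE TABLE history (id INTEGER, value TEXT)")

def get_id(conn):
    """Increment the counter in a dedicated, immediately committed transaction."""
    conn.execute("BEGIN IMMEDIATE")  # write lock held only for this short step
    conn.execute("UPDATE device_attribute_history_id SET id = id + 1")
    (new_id,) = conn.execute("SELECT id FROM device_attribute_history_id").fetchone()
    conn.execute("COMMIT")           # lock released before the main update starts
    return new_id

# The property update itself then runs in a separate transaction and is
# no longer serialised on the counter row for its whole duration.
hist_id = get_id(conn)
conn.execute("BEGIN")
conn.execute("INSERT INTO history VALUES (?, ?)", (hist_id, "some_value"))
conn.execute("COMMIT")
```

Two concurrent writers still serialise briefly on the counter row, but only for the single UPDATE, not for the whole property update.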
Hi Jan,
very interesting. I don’t know what exactly device_attribute_history_id is used for nor why it grows. In my test installation I see it has 5 rows and all have a value of zero. Strange. I thought it was intended to have a unique id for all history entries in the database but I don’t know why this is needed. And you are right it is a bottleneck for all requests. No need to parallelise them if they all get re-serialised by one table. I will ask the designers of the history feature of the database to explain why this is necessary and how to get around the limitations.
Anyway the fact that you don’t have any spikes is excellent news. It would be great to get a clean version of your patches for others to try out.
Cheers
Andy
Hi,
The history ids are used for indexing the history of resources and need to be unique only within a single property. If I remember well, at the beginning I implemented this feature without locking the table. In fact, it is not a problem if 2 history entries of 2 different properties get the same id. But someone (not me) has serialized this. The only problem that can happen with an unserialized get_id() is when 2 different database servers write the same property at the same time. In that case you can have a wrong order in the history for this particular update (or, in the worst case, a mixed value for arrays).
Jean-Luc
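Jean-Luc's point, that ids need to be unique only within one property, could be sketched as follows (sqlite3 with a made-up simplified schema; note this still has the multi-server race he mentions, since two writers of the same property could compute the same id):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE property_hist (property TEXT, id INTEGER, value TEXT)")

def next_id(conn, prop):
    # MAX(id)+1 within a single property; id collisions across
    # different properties are harmless by design.
    (cur,) = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM property_hist WHERE property = ?",
        (prop,)).fetchone()
    return cur + 1

# Two history entries for one property, one for another: ids restart
# per property, and no global counter table is involved.
for value in ("a", "b"):
    conn.execute("INSERT INTO property_hist VALUES (?, ?, ?)",
                 ("speed", next_id(conn, "speed"), value))
conn.execute("INSERT INTO property_hist VALUES (?, ?, ?)",
             ("gain", next_id(conn, "gain"), "x"))
```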
Hi Claudio,
I attach a newer version of my files. (In the previous version there was a bug in a query.)
In this version I put get_id into a separate transaction which seems to improve things.
Bests,
Jan
The spikes still exist but they are much smaller (up to 0.1 without TSM). They come mostly from updating the id, delete/insert history, SELECT … ORDER BY, or commit. The changes work only for setting attribute properties with innodb. So a switch to innodb needs a review of the Databaseds code and many more tests.
Bests,
Jan
[quote=“jan”]Hi Claudio,
I attach a newer version of my files. (In the previous version there was a bug in a query.)
In this version I put get_id into a separate transaction which seems to improve things.
Bests,
Jan[/quote]
Thanks a lot.
Tests are under way with your version of Databaseds on my development machine.
Best regards
Claudio
Update on performance tests: I compiled the patched source that Jan posted on the forums and repeated my tests (see my other post):
test                 average    min      max
inno+trans            4.207     0.523    140.678
isam+lock             8.31      0.54     175.253
patch+inno+trans      0.779     0.626    245.3
Please notice that during the last tests I turned on logging of slow transactions, which may have slowed the db engine a little.
Seems that there is some average improvement.
Regarding:
UPDATE device_attribute_history_id SET id=LAST_INSERT_ID(id+1)
It may be possible to get rid of the *_history_id tables and declare the corresponding id column of the history tables as AUTO_INCREMENT. It requires some extra work on the server code but may result in more compact code (maybe even faster).
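As a sketch of this idea (sqlite3 AUTOINCREMENT as a stand-in for MySQL AUTO_INCREMENT, illustrative schema, and ignoring for the moment how array values would fit): the engine assigns the id atomically on insert, so the UPDATE-then-read round-trip on a counter table disappears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE attribute_history (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    attribute TEXT,
    value TEXT)""")

# No UPDATE ... LAST_INSERT_ID(id+1) step: the id is generated by the
# engine at insert time, with no shared counter row to lock.
cur = conn.execute(
    "INSERT INTO attribute_history (attribute, value) VALUES (?, ?)",
    ("ramp_rate", "10"))
first_id = cur.lastrowid
cur = conn.execute(
    "INSERT INTO attribute_history (attribute, value) VALUES (?, ?)",
    ("ramp_rate", "20"))
second_id = cur.lastrowid
```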
What do you think?
Regards
Hi,
I would like to ask if it is possible to change the create_db.sql script to insert new rows into the tables only if they do not already exist.
In the corresponding debian package this script is called ‘mysql’.
It may cause problems during reinstallation of this package by creating too many rows, e.g. in the device or *_history_id tables.
Bests,
Jan
Jan,
yes, this is what happened in the case where you had multiple (61) rows in *_history_id. On my laptop I have 5 rows. Do you have a patch for the create_db.sql script? Do you know how to drop all but one entry of a table? Just asking to see if you know the answer already.
Kind regards
Andy
Hi,
It is not possible to use auto increment for the history ids due to array values (an array update writes several rows which must share one id).
An interesting improvement would be to find a strategy to disable history for some properties.
Properties which change often and by program may not need a history.
Claudio,
I agree with you it could be a good idea to get rid of the history_id table. Your idea of using AUTO_INCREMENT could do it. Another way could be to use the time of the update to order the history. Just an idea.
Kind regards
Andy
Hi Andy,
No, I’ve not got any script or patch. Actually, at DESY Hannes is now working on scripts which should clean up all the mess in the DB. But it will probably be a DESY-specific solution.
Bests,
Jan
Hi Claudio,
Your test looks really nice. From my tests I can add that the maximum value is usually quite random. And my patch contained a lot of print-outs (I don’t know if you removed them or sent them to /dev/null), which are also a source of spikes.
Bests,
Jan
Andy,
Using the time stamps may be slower than using ids (for history requests), as you will have to order by date and select on time/device/attribute/name instead of id. However, for updates we would write without an id (no locking problem) and the history cleaning might be faster, but this solution does not solve the problem of multiple updates by several databaseds at the same time (which should be very rare).
Jean-Luc
I’ve taken a look at my sql script where I initialize a revision counter in a properties table for the NeXus Configuration Server, and my insert statement looks like
INSERT INTO properties (name, value) SELECT * FROM (SELECT 'revision', '0') AS tmp WHERE NOT EXISTS ( SELECT name FROM properties WHERE name ='revision' ) LIMIT 1;
We could do similar things for create_db.sql
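The pattern above can be checked quickly; here is a runnable version in Python with sqlite3 (which accepts the same statement without MySQL's `FROM (SELECT …) AS tmp` wrapper, a workaround for a MySQL restriction on selecting from the table being inserted into): running the initialisation twice leaves exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE properties (name TEXT, value TEXT)")

# Insert the seed row only if it is not already there.
INIT = """INSERT INTO properties (name, value)
          SELECT 'revision', '0'
          WHERE NOT EXISTS (SELECT name FROM properties WHERE name = 'revision')"""

conn.execute(INIT)   # first install: inserts the row
conn.execute(INIT)   # re-install: no-op, no duplicate
rows = conn.execute(
    "SELECT COUNT(*) FROM properties WHERE name = 'revision'").fetchone()[0]
```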
Hello all,
There is something I do not understand.
The create_db script inserts the history ids in the XXX_id tables. I don’t think the server is able to add entries to these tables,
because in its code there are only UPDATE commands for these tables.
The create_db script starts with the line
CREATE DATABASE tango
If the database already exists, this line fails and the script aborts. I have just tried this on my Ubuntu host.
Therefore, how is it possible to have several lines in these XXX_id tables if the create_db.sql script aborts whenever the db already exists when it is executed?
Cheers
Hi Manu,
The problem is not at the level of Databaseds but at the level of the packages, i.e. .deb or .rpm, which usually use the create_db.sql script to install the tango database.
E.g. for debian, if you install tango-db, remove it (without cleaning the database), and then install tango-db again,
you get duplicated rows.
Of course one can fix this by changing the script in the package, but it would be easier if we did it in the original script.
Bests,
Jan