Graph Template Items may have duplicated entries #4363
Comments
Hi Developers, this ticket was opened by my colleague. Yes, we are doing a cleanup operation in our Cacti. There are many graphs with a missing legend such as |query_ifAlias| or |query_ifHighSpeed|, or whose RRD file is not being updated, maybe because the DB connection was broken at some point. So we thought of finding such graphs, recreating new ones, and merging in the data of the old graphs (i.e. doing mv old_rrd new_rrd). We accomplish this by reindexing through the CLI and then, after finding graphs with a similar title, merging old_rrd into new_rrd with a simple mv command. Cacti has an inbuilt mechanism that detects when the data, |query_ifAlias| or |query_ifHighSpeed| is missing and then creates a new graph. After the new graph is created, we find the duplicate graph title and merge old_rrd with new_rrd. PFA snap.
But what we observe now is that whenever Cacti reindexes and we merge the data, the legends or GPRINT items are all duplicated (current value, max value, average value, or legends like |query_ifHighSpeed| or |query_ifAlias|), and in such a haphazard manner that the GPRINT values are almost illegible. This happens only when a graph already exists and its data is merged with a new one because of the anomaly mentioned above. If the reindex creates graphs for interfaces that have no graphs yet, everything is fine: GPRINT and legends are all OK. So we suspect there may be some kind of bug in the reindex (all we do is mv old_rrd new_rrd, and I am pretty sure the RRD file does not contain GPRINT or legends). Can you please tell us which part of the reindex we need to debug?
Best Regards,
Use the graph debug options to see the RRDtool command being used; this is often a big hint as to what is occurring.
Hi, I tried that and it says all OK. But even in the RRDtool debug output you can see all items getting duplicated, and what surprises me is that the template does not have this. PFB the RRDtool debug output (the device name and query_ifAlias have been changed):
/bin/rrdtool graph -
Does the above output give you any clue? Best Regards,
Hi Team, is the above info OK, or do I need to provide any other info? Best Regards,
Can you post a picture of your graph template definitions? If you are on 1.2.18, this will show the CDEF and GPRINT selections in the table view. For earlier versions, you'd have to go into each item.
When you are editing the graph, does that show the same definition?
Have you tried to re-apply the template to the graph?
Hi, I tried re-applying the graph template, but no luck. It's still the same. Yes, I had put Cacti in debug mode from Settings so that I could capture something helpful. Below is the log:
2021-09-02 18:36:19 - WEBUI Obtaining 'Graph Template' cache
Can we have a call where you can look more closely into the issue? I would invite you and Engineering from my side.
When re-importing the template, did you select the 'Remove Orphans' option? Don't do this unless you have a good backup of the graph_templates_item table.
Hi Team, can we have a call together, maybe tomorrow at 11:00 AM CET? Or you may suggest a time that suits you, and I will mail you an invitation. Also, is there a probable release date for 1.2.19? Many thanks,
Keep your eye on this page: https://github.com/Cacti/cacti/milestones?direction=asc&sort=due_date If anything changes, we update the milestones.
Thanks Witness for clarifying things. Yes, in the snap I see you are importing only the Preview; am I supposed to import only the preview or the whole template? Also, this template, "Interface Traffic", is already in our production, so we do not have to import it from any other environment. So below are my questions. And is it really required, given that I already have the template in prod and whenever I create new graphs it renders a perfect graph? It creates a mess only when I do reindexing. Please suggest further. Best Regards,
Remove Orphans is only for when your template is pristine and exactly the way you want all your graphs to look. It will remove any graph items that don't match your golden template. So, it's important to export templates when they get to the point you want them to be. Otherwise, you might bring in a stray from the field that breaks all your customizations.
You need to first ensure that the template is "exactly" like you want it. Edit the template and see if there are duplicates there. My guess is that there are. Also, can you check a few graphs and see if they are all like this or just a few of them? We might need a database dump to drill down on this more. If the template itself is damaged, fix it, and then export it and re-import it using the Remove Orphans option.
Thanks Larry. Actually, I have inspected the graph template many times and there are no duplicates; maybe your feeling stems from the image I provided in this ticket. And yes if the template is broken then all Regarding dump of database shall many thanks and yes have a nice weekend and yes please take some rest and spare some Best Regards,
Upload the template image and XML file. I just need to see the first two sections of the template. Thanks!
Hmm, you already uploaded it. Okay. So, wanting that template XML.
Run this query and replace local_graph_id = ? with one of the impacted local_graph_id's:
SELECT id, hash, local_graph_template_item_id LGTII,
local_graph_id, task_item_id TII, color_id, graph_type_id type,
cdef_id, text_format, sequence
FROM graph_templates_item
WHERE local_graph_id = ?
Then, run this one, replacing ? with the graph_template_id:
SELECT id, hash, local_graph_template_item_id LGTII,
local_graph_id, task_item_id TII, color_id, graph_type_id type,
cdef_id, text_format, sequence
FROM graph_templates_item
WHERE local_graph_id = 0
AND graph_template_id = ?
Hi Larry, I tried to upload the XML file but couldn't, due to some restriction in place on this portal. I then changed the extension to .txt, but that didn't succeed either, and pasting the whole text here also didn't work because the tags get removed automatically (I do not know why). So I sent the file to your mail; if you could please put it here, that would be great for the community. For the DB query, I will update you soon.
Hi Larry, can we schedule a meeting today or maybe tomorrow? I can share my screen with you so you can debug and reproduce the issue, as it has become a blocker for our cleanup task.
Once you have a working test environment with the issue present, I'll spend some time. In the meantime, there is nothing for me to do here.
Hi Larry, I tried to sync my prod with test. So I took a backup of the whole DB and imported it into the test DB:
Step 1. mysqldump -u username dbname-prod > prod-dump.sql
Step 2. cat prod-dump.sql | mysql -u username -p -D dbname-Test
The problem I am facing is that when the DB is ported from prod to test, it also ports the information about pollers, devices, and their association with pollers, all of which actually belongs to prod, and the poller info of test gets overwritten with that of prod. So the first thing I did after porting the prod DB to test was to disable all the devices in test, to keep them from being polled:
Step 3. update host set disabled='on';
Next, I wanted all the devices reachable from test to be enabled and polled from the test environment servers, so I added a poller, transferred all the devices to the testserver1 poller, and then enabled those devices in test which were up before this migration:
Step 4. insert into poller (id,name) values (random-number, 'testserver1');
Step 5. update host set poller_id=random-number;
Step 6. update host set disabled='' where id in (list of ids separated by comma);
But the challenge here is that testserver1 is added as a remote poller, not as the main poller. So how do I make it the main poller? I tried some other methods: I did a fresh installation on test, assuming that the DB would remain the same and testserver1 would be registered as the main server (as happens in the case of an upgrade), but that did not work and broke all connections in the DB, and I had to repeat the DB restoration to the Cacti test DB. Can you please help me bring my prod in sync with test? Many thanks in advance.
On your test system, you don't even need to enable the poller; we just need to have a snapshot of the database and the RRDfiles copied over for the testing. I'll open a block of time on Friday if you have things ready to go. Just don't enable the poller and this will be easy.
Hi Larry, we have the DB and RRD files restored in test from prod. So let me know when you are available today. I would suggest 14:00 CET. I know it's very short notice, so let me send an invitation, and if it's not OK, you can suggest a suitable time. Best Regards,
I'm at the doctor's office then. Maybe 15:50.
OK Larry, done! I will send you a new invitation for 15:50 CET.
Gopal, how many rows return from this query?
SELECT local_graph_id, local_graph_template_item_id, COUNT(*) as totals
FROM graph_templates_item
WHERE local_graph_template_item_id > 0
AND graph_template_id = ?
GROUP BY local_graph_id, local_graph_template_item_id
HAVING totals > 1;
Hi Larry, it returns 0 rows. In place of ? I used the graph IDs of 2 defective graphs, and both return 0.
MariaDB [cacti]> SELECT local_graph_id, local_graph_template_item_id, COUNT(*) as totals
MariaDB [cacti]> SELECT local_graph_id, local_graph_template_item_id, COUNT(*) as totals
MariaDB [cacti]> SELECT local_graph_id, local_graph_template_item_id, COUNT(*) as totals
Best Regards,
It must be late. I was looking for the count of how many items are broken for that template. Replace "graph_template_id = ?" with the actual graph_template_id, not the local_graph_id.
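To make the mix-up above concrete, here is a minimal sketch that runs the same duplicate-count query against a made-up in-memory SQLite table. The table holds only the columns the query touches, all IDs are invented, and `HAVING COUNT(*) > 1` stands in for `HAVING totals > 1` for portability; it is an illustration, not a copy of a real Cacti schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy subset of graph_templates_item; a real Cacti table has many more columns.
conn.execute(
    "CREATE TABLE graph_templates_item ("
    "id INTEGER PRIMARY KEY, local_graph_id INTEGER, "
    "local_graph_template_item_id INTEGER, graph_template_id INTEGER)"
)
conn.executemany(
    "INSERT INTO graph_templates_item VALUES (?, ?, ?, ?)",
    [
        (1, 0, 0, 5),     # template item (hypothetical template id 5)
        (2, 0, 0, 5),     # template item
        (10, 100, 1, 5),  # graph 100, item derived from template row 1
        (11, 100, 1, 5),  # graph 100, duplicate of the row above
        (12, 100, 2, 5),  # graph 100, item derived from template row 2
        (20, 101, 1, 5),  # graph 101, healthy
        (21, 101, 2, 5),  # graph 101, healthy
    ],
)

DUPLICATES = """
SELECT local_graph_id, local_graph_template_item_id, COUNT(*) AS totals
FROM graph_templates_item
WHERE local_graph_template_item_id > 0
  AND graph_template_id = ?
GROUP BY local_graph_id, local_graph_template_item_id
HAVING COUNT(*) > 1
"""

# Binding the graph_template_id finds the duplicated item on graph 100.
by_template_id = conn.execute(DUPLICATES, (5,)).fetchall()
# Binding a local_graph_id by mistake matches no template, so no rows return.
by_graph_id = conn.execute(DUPLICATES, (100,)).fetchall()
print(by_template_id, by_graph_id)
```

With this toy data, only the lookup by graph_template_id reports a duplicate, which matches the "0 rows" result seen when a graph ID was substituted for the placeholder.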
Ah, my bad. I just replaced it with the graph_template_id of the broken template, and there are 110648 rows: 110648 rows in set (18.91 sec). And it's not very late :) just 10:51 PM local time. Maybe I was too excited about the resolution, or about running the query, which made me make this mistake, haha.
Cool. Sent you a message on an alternate channel. Check your email.
Pretty sure this can be closed now.
I stumbled onto this one myself. I am prepping a change that will allow the repair when you |
Duplicate entries in graph_templates_item - maybe an aftermath of the template edit bug
In this commit I moved the repair into the repair_database.php script and added a new option for a Quick Sync, which will only focus on graphs that have a differing number of graph items from the template.
@gj00354347 You should be able to test now, but it is best if you update the entirety of the 1.2.x branch.
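The idea of focusing only on graphs whose item count differs from the template can be sketched as follows. This is not the logic of repair_database.php, which is not shown in this thread; `graphs_out_of_sync` is a hypothetical helper, and the table is a made-up in-memory SQLite subset with invented IDs.

```python
import sqlite3


def graphs_out_of_sync(conn, template_id):
    """Return (local_graph_id, item_count) for graphs whose number of items
    differs from the number of items in the template itself."""
    query = """
    SELECT local_graph_id, COUNT(*) AS items
    FROM graph_templates_item
    WHERE local_graph_id > 0 AND graph_template_id = ?
    GROUP BY local_graph_id
    HAVING COUNT(*) <> (
        SELECT COUNT(*) FROM graph_templates_item
        WHERE local_graph_id = 0 AND graph_template_id = ?
    )
    ORDER BY local_graph_id
    """
    return conn.execute(query, (template_id, template_id)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE graph_templates_item ("
    "id INTEGER PRIMARY KEY, local_graph_id INTEGER, graph_template_id INTEGER)"
)
conn.executemany(
    "INSERT INTO graph_templates_item VALUES (?, ?, ?)",
    [
        (1, 0, 5), (2, 0, 5),                   # template 5 has two items
        (10, 100, 5), (11, 100, 5), (12, 100, 5),  # graph 100 has three
        (20, 101, 5), (21, 101, 5),             # graph 101 has two
    ],
)

mismatched = graphs_out_of_sync(conn, 5)
print(mismatched)
```

A count comparison like this is cheap, but it only narrows the candidates; a graph could have the right number of items and still reference the wrong template rows.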
You should be able to run a |
Describe the bug
As I understand it, a template declaration inside the DB is identified by the hash being set together with local_graph_template_item_id=0 and local_graph_id=0.
A real graph item, by contrast, contains no hash; its local_graph_template_item_id points to the row that contains the template declaration above and its local_graph_id points to the graph declaration. It also has a task_item_id for those graph elements that are connected to a queried item.
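The distinction described above can be written down as a small classifier. This is a sketch of the reporter's understanding only, not Cacti's own logic; `classify_item` is a hypothetical helper and the sample rows are invented.

```python
def classify_item(row):
    """Classify one graph_templates_item row using the rule described above.

    `row` is a plain dict holding the three columns named in the report.
    """
    has_hash = bool(row["hash"])
    lgtii = row["local_graph_template_item_id"]
    graph_id = row["local_graph_id"]

    if has_hash and lgtii == 0 and graph_id == 0:
        return "template"
    if not has_hash and lgtii > 0 and graph_id > 0:
        return "graph"
    return "suspect"  # matches neither pattern; worth a manual look


# Made-up sample rows, not taken from a real Cacti database.
template_row = {"hash": "0123abcd", "local_graph_template_item_id": 0, "local_graph_id": 0}
graph_row = {"hash": "", "local_graph_template_item_id": 1, "local_graph_id": 100}
zeroed_row = {"hash": "", "local_graph_template_item_id": 0, "local_graph_id": 0}

labels = [classify_item(r) for r in (template_row, graph_row, zeroed_row)]
print(labels)
```

The third row illustrates the case where both IDs are zero but no hash is set, which fits neither description and is flagged rather than guessed at.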
In our DB I have found some strange duplicate entries, which relate to different local IDs than the original template.
Here's an example. Note how, for example, the row with id=10488739 is a duplicate of the previous row, but its local_graph_template_item_id (LGTII) suddenly points to a different "master ID". Instead of the correct entry 477305 it now uses 10488701, which belonged to a graph and does not even exist any more. This in turn can probably cause trouble when a graph template is updated, because the changes are not propagated properly to all graphs due to this shift in relation.
And there are many more of those duplicated rows with TII=0 for the same template, always with the same pattern.
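One way to look for rows with this kind of shifted relation is to list graph items whose local_graph_template_item_id does not resolve to a template row (a row with local_graph_id = 0). The sketch below runs against a made-up in-memory SQLite table with invented IDs; treating "does not point at a template row" as the test is an assumption drawn from the description above, not a documented Cacti rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy subset of graph_templates_item with only the columns used here.
conn.execute(
    "CREATE TABLE graph_templates_item ("
    "id INTEGER PRIMARY KEY, hash TEXT, local_graph_template_item_id INTEGER, "
    "local_graph_id INTEGER, graph_template_id INTEGER, task_item_id INTEGER)"
)
conn.executemany(
    "INSERT INTO graph_templates_item VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "0123abcd", 0, 0, 5, 0),  # template item
        (10, "", 1, 100, 5, 7),       # healthy: points at template row 1
        (11, "", 10, 100, 5, 0),      # stray: points at a graph row
        (12, "", 999, 100, 5, 0),     # stray: points at a row that no longer exists
    ],
)

STRAYS = """
SELECT gi.id
FROM graph_templates_item gi
LEFT JOIN graph_templates_item ti
  ON ti.id = gi.local_graph_template_item_id
 AND ti.local_graph_id = 0
WHERE gi.local_graph_id > 0
  AND ti.id IS NULL
ORDER BY gi.id
"""

stray_ids = [row[0] for row in conn.execute(STRAYS)]
print(stray_ids)
```

The LEFT JOIN keeps every graph item and leaves the template side NULL when the reference is dangling or lands on a non-template row, so both failure modes from the example show up in one pass.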
I still have to check our other templates; a few more of them showed this pattern.
There was also another problem with an apparently deleted graph of the same template, which made the template itself contain all the duplicate items: because the above-mentioned IDs were set to zero, Cacti identified those entries as part of the template.
I have already fixed that simply by removing the problematic entries.
Fixing should be reasonably doable by scanning the DB for entries with identical sequence numbers for the same local_graph_id and deleting the duplicates where the task ID is zero. I hope at least that there is no other relation which needs to be updated in other tables. If you can help me out on that, I can try to create a script which scans for such entries and provide it on my GitHub repo, so that others can use it to check their DB and correct the entries if necessary.
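The scan proposed above could start as a read-only candidate listing like the following sketch. It uses a made-up in-memory SQLite table and invented IDs, and it adopts one possible interpretation: within a graph, a row is a candidate if task_item_id is zero and an earlier row (lower id) already has the same sequence. Whether such rows are safe to delete, and whether other tables reference them, is exactly the open question in the report, so nothing is deleted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy subset of graph_templates_item with only the columns used here.
conn.execute(
    "CREATE TABLE graph_templates_item ("
    "id INTEGER PRIMARY KEY, local_graph_id INTEGER, "
    "task_item_id INTEGER, sequence INTEGER)"
)
conn.executemany(
    "INSERT INTO graph_templates_item VALUES (?, ?, ?, ?)",
    [
        (1, 0, 0, 1),     # template item, ignored by the scan
        (2, 0, 0, 2),     # template item, ignored by the scan
        (10, 100, 7, 1),  # graph 100, sequence 1, real data source
        (11, 100, 0, 1),  # graph 100, sequence 1 again, task_item_id 0
        (12, 100, 7, 2),  # graph 100, sequence 2
        (13, 100, 0, 3),  # graph 100, lone item without a data source
    ],
)

CANDIDATES = """
SELECT d.id
FROM graph_templates_item d
JOIN graph_templates_item k
  ON k.local_graph_id = d.local_graph_id
 AND k.sequence = d.sequence
 AND k.id < d.id
WHERE d.local_graph_id > 0
  AND d.task_item_id = 0
GROUP BY d.id
ORDER BY d.id
"""

candidate_ids = [row[0] for row in conn.execute(CANDIDATES)]
print(candidate_ids)
```

Requiring an earlier row with the same sequence keeps legitimate items that simply have no data source (row 13 in the toy data) out of the candidate list.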
I have also seen that the entry with id=1 in the table graph_templates_graph was overwritten with some other graph template's data. This looks like a related problem; I am not sure whether it was fixed in the meantime, and you probably have more insight than I do. It would be nice if someone could enlighten me here as well.
I feel that this could be an aftermath of the old template edit bug that was fixed in 1.2.17 - #4237
Thanks!