Citation preview incrementally increases memory usage with each preview #2533

kingwarrick · 2017-02-09T18:49:59Z

JabRef 3.8.2
windows 10 10.0 amd64
Java 1.8.0_121

RAM is used during preview but not fully released. With each additional preview, less is released, until usage runs away. After ~4-5 previews (or ~500 MB of RAM; I am not sure which is the relevant threshold), no appreciable amount of RAM is released, and each subsequent preview adds 100-500 MB to total RAM usage. Behaviour changes after ~1400 MB of RAM use: usage grows more slowly and small amounts of RAM are released more often.

Tested with different preview styles (no change); interesting to note how CPU use in IEEE preview generation spikes to ~40%, whereas APA spikes to ~20%.

NOT tested with any release versions.

Steps to reproduce [RAM use in square brackets]:

Open JabRef and bibtex database (~900 entries)
Single click on entry to show preview box, wait for preview to show
Single click on another entry, wait for preview to show

This shows in the exceptions tab of the error console:

ANTLR Tool version 4.5.3 used for code generation does not match the current runtime version 4.6
ANTLR Runtime version 4.5.3 used for parser compilation does not match the current runtime version 4.6
(this pair of messages appears four times)


kingwarrick · 2017-02-09T19:08:53Z

Similar to this:
#2247
but for preview rather than entry editing

tobiasdiez commented on Nov 8 2016
I can confirm at least the large memory footprint. After opening a normal db (mine has 500 entries and a few groups) JabRef eats only 200 MB of RAM. Now open the entry editor and run through a few entries (using the arrow keys so that a new entry editor is generated for the entries). Result: over 1 GB of RAM usage. Btw: the same db with 3.6 only needed < 100 MB RAM.

koppor · 2017-02-10T07:40:41Z

Yes, we cache the preview to gain speed when showing the second time. @bartsch-dev What is our policy for cache clearance?

lenhard · 2017-02-28T16:59:49Z

The mention did not work, so here again @bartsch-dev What is our policy on cache clearance?

chriba · 2017-02-28T17:26:47Z

The cached citation gets deleted when a field of the BibEntry changes or the BibEntry itself is removed (or the selected citation style changes). https://github.com/JabRef/jabref/blob/master/src/main/java/org/jabref/logic/citationstyle/CitationStyleCache.java#L59

It gets generated again the next time it's selected by the user and the preview is active.
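The invalidation policy described above can be sketched as follows. This is a minimal illustration that assumes a plain map keyed by an entry id; `PreviewCache` and its method names are hypothetical and are not the API of JabRef's `CitationStyleCache`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration only; not JabRef's CitationStyleCache. */
class PreviewCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> renderer;
    private int renderCount = 0; // how often a preview had to be (re)generated

    PreviewCache(Function<String, String> renderer) {
        this.renderer = renderer;
    }

    /** Returns the cached preview, rendering it on first access. */
    String getPreview(String entryId) {
        return cache.computeIfAbsent(entryId, id -> {
            renderCount++;
            return renderer.apply(id);
        });
    }

    /** A field of the entry changed: drop only that entry's preview. */
    void onFieldChanged(String entryId) {
        cache.remove(entryId);
    }

    /** The entry was removed from the database. */
    void onEntryRemoved(String entryId) {
        cache.remove(entryId);
    }

    /** The selected citation style changed: every cached preview is stale. */
    void onStyleChanged() {
        cache.clear();
    }

    int getRenderCount() {
        return renderCount;
    }
}
```

Note that nothing in this policy limits how many previews are held at once; entries only leave the cache when they are edited, removed, or the style changes.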

koppor · 2017-03-27T15:07:30Z

This issue will be solved if we remove the cache and finally integrate #2250 in the master branch.

LinusDietz · 2017-04-04T12:34:17Z

Sorry, can't confirm with current master 1f893cd under linux with

openjdk version "1.8.0_121"
OpenJDK Runtime Environment (build 1.8.0_121-b13)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

I opened a 6k-entry BibTeX database and performed the steps described above. RAM never went beyond 425 MB before garbage collection kicked in.

lenhard · 2017-04-05T07:23:48Z

Thanks for testing @lynyus!

@kingwarrick Please give it a try with the latest version of JabRef from http://builds.jabref.org/master/ Your problem seems to be solved in the most recent dev builds. Please do this with a backup of your bib file, since there are a number of breaking changes regarding groups. Let us know about your experiences!

kingwarrick · 2017-04-05T16:02:06Z

Hello,

The issue persists on my machine, on both Linux (openjdk, openjfx) and Windows 10.

JabRef 4.0.0-dev--snapshot--2017-04-05--master--a1f4101df
Windows 10 10.0 amd64
Java 1.8.0_121

Let me know if I can help in troubleshooting further.

Warrick


matthiasgeiger · 2017-04-05T16:41:23Z

I can reproduce the high RAM and CPU usage if one of the CSL styles is used for the preview. Performing a manual garbage collection using jvisualvm reduces the amount of used heap space, but this memory is not freed for the OS.

I assume we cannot really do anything about this. Potentially #2250 will solve it, but I'm not sure...
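The difference between used heap and memory still held from the OS, as observed above, can also be checked from inside the JVM. The following is a small sketch using the standard `java.lang.management` API; the class name `HeapReport` is made up for illustration. "Used" is what live and not-yet-collected objects occupy, while "committed" is what the JVM has reserved from the operating system, which is closer to what a task manager shows.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Illustrative helper; the class name is hypothetical. */
class HeapReport {

    /** Returns {usedBytes, committedBytes} for the JVM heap. */
    static long[] snapshot() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return new long[] {heap.getUsed(), heap.getCommitted()};
    }

    static String format(long[] s) {
        return String.format("used=%d MB, committed=%d MB", s[0] >> 20, s[1] >> 20);
    }

    public static void main(String[] args) {
        System.out.println("before gc: " + format(snapshot()));
        System.gc(); // only a hint; the JVM may ignore it
        System.out.println("after gc:  " + format(snapshot()));
    }
}
```

After a collection, the used value typically drops while the committed value may stay the same. Whether committed memory is ever returned to the OS depends on the garbage collector and the JVM settings in use.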

koppor · 2017-04-05T18:18:17Z

Yes, it will be solved. However, currently no one has time to work on the build process. Hopefully after the alpha release. Let's see.

koppor · 2017-04-06T12:32:05Z

@lynyus has worked on the caching of citation styles and it should get better.

@kingwarrick Is there any upper bound of memory usage? - There should be...

LinusDietz · 2017-04-06T12:42:50Z

Actually I have not. I implemented the caching in the FileAnnotationCache.

kingwarrick · 2017-04-06T13:00:56Z

I have not experienced an upper bound, but have seen it climb to ~1900MB before I stopped testing. The increase is not linear, but seems to occur in bursts. Not sure if it is entry-specific; I will try to be more systematic to see if there are specific entry types that use more memory.

I saw some graphs in an issue thread that showed memory use over time (can't find it now, but it dealt with garbage collection). Can you or someone direct me on where to get/how to use that tracing/debugging program? This would be easier for me (and maybe more informative for you) than relying on task manager and top.

Edit: This graph: #2247 (comment)


matthiasgeiger · 2017-04-06T13:55:21Z

That's jvisualvm (just search in the start menu) which is shipped with the Oracle JDK.

kingwarrick · 2017-04-06T15:41:56Z

Hopefully this will help:

koppor · 2017-09-01T06:07:13Z

PR #3089 should have introduced a limit on memory consumption. Could you please check again? I hope that each preview no longer adds 100 MB of RAM. If it still does, we really have to reduce the number of cached entry previews. Currently, up to 1024 entries are cached.
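For readers unfamiliar with size-bounded caches, one common way to cap the number of cached previews is a least-recently-used map. The sketch below uses only `java.util.LinkedHashMap`; the class name `BoundedPreviewCache` is hypothetical, and PR #3089 may well use a different mechanism.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical LRU cache: evicts the least recently used entry once full. */
class BoundedPreviewCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedPreviewCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion.
        return size() > maxEntries;
    }
}
```

A bound on the entry count only bounds memory if each cached value is small. With a limit of 1024, previews of 100 MB each would still be far too much, which is why the per-preview cost matters as much as the limit.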

kingwarrick · 2017-09-01T13:45:11Z

JabRef 4.0-dev--snapshot--2017-09-01--master--e46d5ee88
Windows 10 10.0 amd64
Java 1.8.0_144

First test (preview one entry at a time, advancing after display; no screenshot):
Memory usage increased significantly for the first 15 previews up to ~1500MB (same as prior), peaked around 1700MB briefly, with memory usage hovering around ~1550MB after preview 100.

Second test (quickly advance through 3-10 entries, triggers preview for more than one entry; screenshot attached):
Memory use increases rapidly, peaking at ~2200MB and hovering around 2000MB.
The time-to-display of the preview is much quicker after about 20 previews.

sbitzer · 2017-10-16T15:30:36Z

I can confirm that the 4.0 release has severe memory issues under Linux. Just opening my database with Jabref 4.0 makes Java use 984 MB of RAM. The first search makes this go up to 1.2 GB. Stepping through the first 20 shown entries brings this to 2.1 GB. After adding 5 articles this morning and searching some others Jabref was at 3.2 GB. That's not really practicable. How can caching something like a 300 character-string (citation preview) cost several 10s of MB of RAM?

JabRef 4.0
Linux 3.16.0-38-generic amd64
Java 1.8.0_144

lenhard · 2017-10-17T12:22:23Z

To be fair to the 4.0 release, this problem was present well before that :) And of course caching a 300-character string does not take tens of MB of RAM; obviously JabRef stores more than that (and I am in no way saying that's good).

Given that the CSL cache seems to be totally non-functional, can we simply disable or remove it? Or limit the cache to, say, a single-digit number of entries? I know that this will mean that displaying the previews will take longer, but the memory consumption described here is far beyond anything justifiable. I have no deeper knowledge of how the previews work internally, but maybe one of the developers who contributed to it can clarify?

@sbitzer the quickfix (mentioned above) seems to be to not use citation styles for the preview.

EDIT: I think we should do something here soon and added it to the 4.1 milestone.

tobiasdiez · 2017-12-06T16:16:46Z

@halirutan do you have time to look at this issue? Your analysis in other performance related issues was superb. Of course, I can help fixing the code.

halirutan · 2017-12-06T16:25:20Z

@tobiasdiez I already profiled this. I have spent the last few weeks working on my Mathematica plugin, which explains my absence. I have two issues on my list that I want to look at: this one and the disastrous behavior of the group view for moderately nested group trees.

If my memory serves me right, I couldn't come to a definite conclusion on this issue. It is a combination of several string builders. I will try to repeat the profiling to find out whether there is a point where we can make a worthwhile impact on the memory consumption.

halirutan · 2017-12-09T05:38:20Z

@tobiasdiez @sbitzer @lenhard @kingwarrick I cannot confirm the high RAM usage with the latest master c51ecbe. I used a large database (several thousand entries) and scrolled through 500 of them, showing the preview each time in IEEE style. The preview is the biggest contributor, but it used less than 90 MB of RAM. That comes to roughly 180 KB per preview. See the highlighted line.

However, one thing that I find highly suspicious is that the simple call that sets the text in the JEditorPane with javax.swing.JEditorPane#setText in these lines uses 52 MB of the overall 86 MB.

Can someone on a machine that shows the weird behavior try commenting out the actual setting of the preview string?

    public void setPreviewLabel(String text) {
        if (SwingUtilities.isEventDispatchThread()) {
            final Document document = previewPane.getDocument();
            // Experiment: skip updating the pane to take JEditorPane out of the memory picture.
            // previewPane.setText(text);
            // previewPane.revalidate();
        } else {
            SwingUtilities.invokeLater(() -> {
                // previewPane.setText(text);
                // previewPane.revalidate();
            });
        }
        this.scrollToTop();
    }

This keeps the whole machinery that creates the preview string in JabRef alive but takes the JEditorPane out of the memory picture. Scroll slowly, bit by bit, through the entries with the preview open and see what your memory does.

I'm on Ubuntu 16.04 with Oracle JRE.

tobiasdiez · 2017-12-09T10:15:04Z

@halirutan Thanks for the analysis. 180 KB per preview sounds fine. So with the current cache size of 1024, there shouldn't be any problems, since the total memory consumption is then around 200 MB. So how do these astronomical reports of > 2 GB come about?

It is understandable that the JEditorPane has quite some overhead. In the end it is a full web browser, and the simple string has to be parsed into an HTML tree and then displayed. Maybe the migration to JavaFX #3504 helps a bit in this regard.

lenhard · 2017-12-09T19:57:08Z

@halirutan @tobiasdiez If I am not mistaken, the memory issue happens only when using citationstyles, not the default preview or the IEEE preview.

koppor · 2017-12-09T23:06:24Z

@lenhard That's true.

halirutan · 2017-12-11T01:31:18Z

Here comes the revised performance analysis. I am not certain, but if I understand this correctly, the cache is not the problem. The main drain on resources seems to be the CSL library. Let us first look at the memory. Note that the profiler does not track all objects, which is why there is a large difference between the size in line 1 and the size that the CSL lib uses. What I tested is the creation of exactly one preview that uses one of the provided CSL styles and not the default one.

This is the memory footprint:

As you can see, 98% of the memory is eaten up by creating the styled bib-entry through the CSL library. Notably, 63% of it only by constructing the CSL instance inside makeAdhocBibliography. When we look at the comment of this function, it states:

Creates an ad hoc bibliography from the given citation items. Calling this method is rather expensive as it initializes the CSL processor. If you need to create bibliographies multiple times in your application you should create the processor yourself and cache it if necessary.

That is no exaggeration. I understand that using makeAdhocBibliography is a nice way of feeding in only one bib item and creating an HTML view of it, but it comes at a steep price. I have quickly looked through the API, and it seems to be optimized for handling and rendering whole bibliographies rather than for a fast preview of single items.

If possible at all, I would advise holding our own CSL instance and reusing it. If we look at the performance, we get the same picture. Please don't be confused by the overall 25%. I let the application sit idle for a moment, so a lot of time was spent only waiting for me. The 25% is the time in which the hard work was done.

What you should take from this is that of the 25% of the time needed to preview the item, 18% was spent just constructing the CSL instance. That is tremendous (50 out of 70 seconds under profiling!). The code for makeAdhocBibliography is short enough to share:

    public static Bibliography makeAdhocBibliography(String style, String outputFormat,
            CSLItemData... items) throws IOException {
        ItemDataProvider provider = new ListItemDataProvider(items);
        CSL csl = new CSL(provider, style);
        csl.setOutputFormat(outputFormat);

        String[] ids = new String[items.length];
        for (int i = 0; i < items.length; ++i) {
            ids[i] = items[i].getId();
        }
        csl.registerCitationItems(ids);

        return csl.makeBibliography();
    }

There are 3 parts in this method. First, the CSL instance is created. As you can see, the ListItemDataProvider is used to construct the CSL instance, and it already contains the 1 item that we want to process. So if this is possible, we should use our own data provider that is updated with the current item, but this is the first time I have looked at this library and I'm not sure this works.

There are 2 other parts. The for loop just copies the ids, which is hopefully fast. The last part consists of registerCitationItems and makeBibliography. Both methods come with a very notable performance and memory footprint as well, as you can see in the images, but not as large as the constructor, and I'm not sure how much this can be improved.

My very naive view is: they need to initialize the whole machinery and they need to parse the CSL style. I understand this is hard work. But then they have everything set up, and we provide CSLItemData, so they only need to render it according to the style. Why are both final methods so expensive?

Anyway, my suggestion is to leave the cache alone or turn it off. More important is that we initialize the CSL engine once and reuse it as much as possible.
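The reuse idea suggested above can be sketched in isolation. This is a minimal illustration under stated assumptions: `StyleEngine` is a hypothetical stand-in for an expensive processor such as the CSL instance, and `ReusingRenderer` is likewise invented. It does not use the citeproc-java API and is not JabRef's code.

```java
import java.util.Objects;

/** Hypothetical stand-in for a processor that is expensive to construct. */
class StyleEngine {
    static int constructions = 0; // counts how often the costly setup runs
    private final String style;

    StyleEngine(String style) {
        constructions++; // imagine parsing the style and starting a script engine here
        this.style = style;
    }

    String render(String entryId) {
        return "[" + style + "] " + entryId;
    }
}

/** Keeps one engine alive and rebuilds it only when the style changes. Not thread-safe. */
class ReusingRenderer {
    private StyleEngine engine;
    private String currentStyle;

    String preview(String style, String entryId) {
        if (engine == null || !Objects.equals(style, currentStyle)) {
            engine = new StyleEngine(style); // pay the initialization cost once per style
            currentStyle = style;
        }
        return engine.render(entryId);
    }
}
```

With this shape, the first preview for a given style pays the full initialization cost and later previews only pay for rendering. Switching styles triggers one more initialization.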

koppor · 2017-12-11T07:20:52Z

Another quick workaround is to switch to the V8 engine. See #3180, especially the screen recordings with and without V8.

tobiasdiez · 2017-12-19T17:37:00Z

This should be fixed in the latest development version. Could you please check the build from http://builds.jabref.org/master/. Thanks!

sbitzer · 2017-12-21T11:50:35Z

I just tried it out. Stepping through the first 50 entries of my database producing APA-style previews for them, increased the memory usage of Jabref from 915 MB to 1.4 GB. That's certainly less than before, but it's still a lot. For comparison: Doing the same with the standard entry preview raises memory usage only from 915 MB to 920 MB.

JabRef 4.1-dev--snapshot--2017-12-21--master--7819cb4aa
Linux 3.16.0-38-generic amd64
Java 1.8.0_151

koppor · 2017-12-21T11:57:17Z

How is the rendering speed? Maybe we can drop the caching of rendered entries completely?

sbitzer · 2017-12-21T13:02:46Z

The first entry preview takes a while to be generated. The others are almost instantaneous.

halirutan · 2017-12-21T14:30:26Z

@sbitzer The first entry takes exactly as long as every entry needed before the patch. It initializes the CSL engine before it can render a preview. After that, we reuse it and until you change the style, every preview is fast. On a style change, the first item will trigger a re-initialization again.

@koppor I am certain the cache is not the problem and in fact, it saves us from using even more memory. Look here:

Simply running through 20 items and showing the preview uses 87% of the memory. 43% is spent in CSL.makeBibliography, 39% is spent in registering the item in the CSL engine and only 4% is spent by JabRef creating the format for CSL. I'm sorry to say this, but this is as far as I can optimize it.

The real pain under the hood is the NashornScriptEngine that is afaik used to run the JS code.

kingwarrick changed the title ~~Citation preview incrementally increases memory usage for each preview~~ Citation preview incrementally increases memory usage with each preview Feb 9, 2017

lenhard added the status: waiting-for-feedback The submitter or other users need to provide more information about the issue label Feb 28, 2017

LinusDietz self-assigned this Mar 27, 2017

koppor removed the status: waiting-for-feedback The submitter or other users need to provide more information about the issue label Apr 6, 2017

LinusDietz mentioned this issue May 12, 2017

JabRef 4.0.0-beta needing way more ressources than v.3.8.2 #2837

Closed

koppor mentioned this issue Jul 18, 2017

Use Eclipse Collections #3023

Closed

lenhard added this to the v4.1 milestone Oct 17, 2017

lenhard added component: entry-preview dev: performance labels Oct 23, 2017

AEgit mentioned this issue Oct 24, 2017

Performance issues with open entry editor #3175

Closed

tobiasdiez mentioned this issue Dec 8, 2017

[WIP] Convert entry preview panel to JavaFX #3504

Closed

6 tasks

This was referenced Dec 12, 2017

Help wanted: Improving performance when repeatedly creating a one-entry bibliography michel-kraemer/citeproc-java#40

Closed

Fix preview performance #3533

Merged

tobiasdiez closed this as completed in #3533 Dec 19, 2017
