2021-07-05 GnuCash IRC logs

00:10:08 *** qwer has joined #gnucash
00:47:53 *** Mechtilde has joined #gnucash
01:09:41 *** fell has quit IRC
01:11:00 *** fell has joined #gnucash
01:11:00 *** ChanServ sets mode: +o fell
01:24:36 *** o01eg has joined #gnucash
01:38:32 *** sbluhm has joined #gnucash
01:39:17 *** frakturfreak has quit IRC
01:51:23 *** jervin has joined #gnucash
01:53:34 *** frakturfreak has joined #gnucash
02:11:22 *** field^Zzz3 has joined #gnucash
02:26:06 *** Mechtilde has quit IRC
02:26:07 *** chris_ has joined #gnucash
02:26:08 *** gncbot sets mode: +o chris_
02:27:38 *** chris has quit IRC
02:28:17 *** chris has joined #gnucash
02:28:17 *** ChanServ sets mode: +v chris
02:28:22 *** gncbot sets mode: +o chris
02:30:02 *** chris_ has quit IRC
03:05:17 <Simon> jralls: which is why I don't want to be copying long linked lists for prices 140K times
03:28:42 *** celeste has quit IRC
03:40:38 *** celeste has joined #gnucash
03:40:38 *** ChanServ sets mode: +v celeste
03:48:29 <Simon> if it was a vector then it may only need a single memcpy; it's also sorted so binary searches would then be possible
03:59:53 *** AstronautSurfer has joined #gnucash
03:59:53 *** ChanServ sets mode: +v AstronautSurfer
04:01:04 *** AstronautSurfer has quit IRC
04:11:12 *** Aussie_matt has quit IRC
04:20:02 *** Mechtilde has joined #gnucash
05:34:27 *** storyjesse has joined #gnucash
06:21:14 *** field^Zzz3 has quit IRC
06:30:06 *** User has joined #gnucash
09:12:03 <Simon> I still have 2m calls to qof_log_check()
09:17:22 <Simon> ENTER() and LEAVE() in xaccTransCommitEdit 537k times
09:53:52 *** Bambuzel has joined #gnucash
09:53:52 *** ChanServ sets mode: +v Bambuzel
10:00:09 *** AstronautSurfer has joined #gnucash
10:00:09 *** ChanServ sets mode: +v AstronautSurfer
10:23:33 *** sbluhm has quit IRC
10:52:36 <Simon> 20% improvement with -flto
10:53:31 <Simon> time64_cmp should really be in a header file to ensure it gets inlined
10:54:03 <Simon> alternatively, callgrind is just able to identify it even when inlined
10:57:45 <chris> time64_cmp should return (a-b)
10:58:51 <chris> does your callgraph show lots of calls to gnc_price_list_insert ?
10:59:17 *** gnomey has left #gnucash
11:01:08 <chris> if so, then g_list_insert_sorted is an obvious O(N^2) problem
11:02:31 <chris> the time64 output doesn't need -1 0 +1 -- g_list_insert_sorted only cares about the sign
11:03:13 <chris> (beware overflow though)
11:08:17 <Simon> yes but time64_cmp() is in a .cpp file so it can only be inlined by -flto
11:08:37 <Simon> putting it in a .h file will allow it to entirely disappear
11:09:11 <Simon> yes g_list_insert_sorted and gnc_price_list_insert are called a lot
11:09:20 <chris> bingo
11:09:37 <chris> slowdown-town
11:09:38 <Simon> but "a lot" may or may not be significant
11:10:09 <Simon> I think it needs special-casing for the load process where the next price is *always* newer
11:10:14 <jralls> Simon, that's utter nonsense. That phase of compilation can't even tell what source file a definition is in, it works on compilation units after the preprocessor has combined all of the #includes.
11:11:06 <Simon> what? if it's defined in a .cpp file then the implementation may not be available at the point of use to be able to inline it
11:11:36 <Simon> although in this case it's all in the same file
11:12:17 <chris> Simon: check gnc_account_insert_split for similar approach with a special 'loading' editlevel
11:12:48 <jralls> Exactly. It's a local function for gnc-pricedb.c. Note that it's not even C++ here, but that doesn't matter for inlining, C and C++ work the same way.
11:22:22 *** o01eg has quit IRC
11:25:53 <Simon> gaining 0.8s (8.5%) by forcing it to prepend prices, it doesn't look like the editlevel is set on the price
11:26:45 <Simon> if they should always be at the start of the list the sort overhead should be minimal
11:27:56 <Simon> oh there's already a boolean for bulk updates that ignores duplicates
11:28:24 *** David has quit IRC
11:28:26 <Simon> old_price = gnc_pricedb_lookup_day_t64 (db, p->commodity, p->currency,
11:28:28 <Simon> p->tmspec);
11:28:30 <Simon> if (!db->bulk_update && old_price != NULL)
11:28:32 <Simon> well that's wrong...
11:28:41 *** David has joined #gnucash
11:28:44 <Simon> it shouldn't be setting old_price if this is a bulk_update
11:29:04 *** Mechtilde has quit IRC
11:29:11 <Simon> there would be no need for me to remove that code if the condition was checked before doing that
11:30:04 *** jralls is now known as jralls_afk
11:35:55 <chris> gnc-pricedb.c is a great candidate to sqlitize
11:37:16 <Simon> looking at the implementation of g_list_insert_sorted() it should only need to look at the head of the list if adding in reverse order
11:37:41 <jralls_afk> Simon, time64_cmp can't be inlined without -flto because it's a callback being passed to a function from libglib2.so.
11:50:30 <Simon> I don't really expect to see 44.5m calls to compare_prices_by_date for 44k prices
11:51:06 <Simon> all from g_list_insert_sorted
11:51:21 <Simon> I printed out the time for each of them and they look like they're loaded in reverse order
12:07:04 *** sbluhm has joined #gnucash
12:12:25 <Simon> I'm not sure the compiler is smart enough to optimise the parts of GncInt128 that frequently call get_flags() and then another function that also calls get_flags()
12:13:20 *** mydogsnameisrudy has joined #gnucash
12:13:23 <Simon> I'm seeing separate isNan() and isOverflow() calls but most of the time those are used together
12:19:31 *** AstronautSurfer has quit IRC
12:38:22 *** mydogsnameisrudy has quit IRC
12:38:47 *** David has quit IRC
12:49:50 *** David has joined #gnucash
12:54:50 <Simon> I think GncDateTimeImpl::GncDateTimeImpl(std::string str) is using the two regex in the wrong order
12:55:25 <Simon> if it was using the format used in the XML file it would save half the regex_match() calls
13:44:18 <Simon> even with only one of them, that regex_match() is about 10-13% of the total time
13:48:00 <Simon> I think the next biggest use of time is going to be boost::posix_time::time_from_string
13:53:07 *** r2d3 has joined #gnucash
14:08:28 *** Mechtilde has joined #gnucash
14:17:43 *** storyjesse has quit IRC
14:21:55 * Simon notes that all the price ordering has now been reversed
14:32:35 *** r2d3 has quit IRC
14:42:02 <Simon> oh. they're loaded in order which is already sorted
14:42:24 <Simon> so it has to go to the end of the list every time to append it, comparing it against every element
15:01:37 *** jervin has quit IRC
15:01:41 <Simon> I started converting it to a GPtrArray but the merge function is not trivial
15:01:51 *** kcin has joined #gnucash
15:02:42 <Simon> (at least, not trivial enough for me to quickly change function names)
15:18:14 <Simon> the SQL load of prices is even worse... it's unsorted coming out of the table
15:18:48 <Simon> which means that a generic "append instead of sorted insert" isn't going to work for everything
15:21:55 <Simon> https://github.com/Gnucash/gnucash/pull/1067 fixes the main issue with it, but the overhead to log nothing is worse
15:31:49 *** sbluhm has quit IRC
16:07:45 *** kcin has quit IRC
16:11:00 *** qwer has quit IRC
16:20:18 *** Bambuzel has quit IRC
16:22:17 *** Mechtilde has quit IRC
17:11:25 *** celeste has quit IRC
17:11:52 *** User has quit IRC
17:32:25 -jralls_afk- Simon, for bulk load it would make sense to always prepend to the list then sort at the end.
17:37:24 <Simon> I think it really needs to be an array, not a linked list
17:37:30 *** celeste has joined #gnucash
17:37:31 *** ChanServ sets mode: +v celeste
17:38:01 <Simon> for my use I'm appending because I know it's in the right order for that
17:38:35 <Simon> the API for gnc_price_list_insert is a bit awkward because the last argument is for checking duplicates, not bulk loading
17:38:41 *** o01eg has joined #gnucash
17:40:55 <jralls_afk> It can be changed internally but still needs to export lists. Scheme pretty much only understands lists; it's also the default container for a lot of Gtk.
17:41:42 <jralls_afk> But I wouldn't use GLib arrays. If you're going to do heavy work like that it makes more sense to convert the file to C++ and use std containers.
17:42:54 <jralls_afk> Don't append when building a list from scratch. Always prepend and reverse when done.
17:43:46 <Simon> yes but then I need to write code to iterate through the hash table and reverse all the lists
17:44:08 <Simon> whereas what I really need is to just load the data faster
17:44:43 <Simon> at some point I will change it to C++
17:46:26 *** mydogsnameisrudy has joined #gnucash
17:46:51 *** mydogsnameisrudy has quit IRC
17:47:01 *** mydogsnameisrudy has joined #gnucash
17:50:16 *** mydogsnameisrudy has quit IRC
17:56:59 <jralls_afk> The iterator code is easy-peasy. A one-line GHFunc wrapper for g_list_reverse, a similar one for g_hash_table_foreach that passes it, and a call to g_hash_table_foreach that passes the second GHFunc wrapper.
18:42:31 *** fell has quit IRC
18:44:56 *** fell has joined #gnucash
18:44:56 *** ChanServ sets mode: +o fell
19:11:45 *** Aussie_matt has joined #gnucash
20:02:04 <chris> Simon: you'll want to use a real email address in your commit authorship for proper GPL attribution
20:05:37 *** jralls_afk is now known as jralls
20:54:50 *** celeste has quit IRC
21:07:58 <chris> shoot forgot about test-new-owner-report
21:59:11 *** jervin has joined #gnucash
21:59:28 <chris> jralls: see my bug797596 merge commit-- created using --no-ff -- IMHO gitk shows a cleaner merge tree. Should we enforce it? If so then gitolite will need https://stackoverflow.com/questions/49284788/gitolite-how-to-prohibit-fast-forward-merging
22:01:02 <chris> actually I don't think this can be enforced by gitolite
22:13:25 *** jonp has joined #gnucash
22:15:03 *** celeste has joined #gnucash
22:15:03 *** ChanServ sets mode: +v celeste
22:15:33 *** Guest62 has quit IRC