On the UK-DAC I’ve noticed that `select count(*) from dp02_dc2_catalogs.Object` queries are slow via our portal but fast via our notebook. On the US RSP they are fast both ways.
We have row-count statistics on our Qserv tables, so I’m assuming the difference is that the portal adds a `LIMIT` clause. So either our Qserv or our RSP is doing something different in parsing/removing this.
We’re a few releases behind on Qserv but our RSP is up to date, so I’m leaning towards the difference being in Qserv, but wondered if anyone knows?
I had a search of various GitHub repos but couldn’t readily find anything.
Thanks
Mike
What version of Qserv are you running?
Yes, `LIMIT <N>` is what’s ruining the optimization. I will open a JIRA ticket to address this issue. The result of the query shouldn’t be affected by the presence of `LIMIT <N>`.
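To make that concrete (table name taken from the query above; the exact `LIMIT` value the portal adds is just a placeholder here):

```sql
-- Answered from the precomputed row-counter statistics: sub-second
SELECT COUNT(*) FROM dp02_dc2_catalogs.Object;

-- Same single-row result, but the LIMIT clause currently disables the
-- optimization, so Qserv falls back to counting rows chunk by chunk
SELECT COUNT(*) FROM dp02_dc2_catalogs.Object LIMIT 1000;
```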
It’s not a version-specific problem. It’s in all versions of Qserv since the optimization was introduced ~3 years ago.
Think we’re on 2024.5.1-rc7 on our prod cluster.
Thanks both.
Though I don’t understand why the US-RSP portal query is quick? That has the `LIMIT` clause in it, AFAICT.
I’m curious, how much faster are queries submitted via the US-RSP portal compared to yours?
I have registered the following JIRA ticket to address the issue: https://rubinobs.atlassian.net/browse/DM-51585
On the US-RSP the portal count query returns in around 5 seconds, which seemed like it’s using the precalculated stats. If I add a `WHERE` clause it takes a bit longer, ~30s? Maybe it’s just that your Qserv is that much quicker.
On ours it takes tens of minutes and I can see it querying each chunk, i.e. actually doing a count.
The optimized queries should return in less than a second. 5 seconds means a scan, which is very fast if no `WHERE` is present. 30 seconds is reasonable performance if some lightweight `WHERE` is present. Qserv needs to be fixed to address this problem.
Hi @gapon , going through the RSP though, I presume there are a few seconds of RSP/TAP overhead and the count query itself is only taking a couple of seconds at the Qserv end? If that is doing a scan, counting rows in each chunk/worker and aggregating the results, that’s very impressive! And a thousand times or more quicker than our (HDD prod) system. Hopefully when we get our new NVMe-based prod system up and running we’ll get closer to this performance.
As a further test I did a US-RSP portal simple count query on ForcedSource, so the `LIMIT` clause is present. This returns the expected 111 billion row count in just a few seconds; how is this so quick? I was expecting it to take a few hundred seconds. I think I’m missing something still; maybe the newer Qserv is more optimized somehow for full simple counts without using the stats?
No “magic” here. The unconstrained row counting in the actual query execution mode (if no row-counter statistics are available to the Qserv Czar) is quick because we rely on the MyISAM table engine. The engine quickly reports the number of rows in each chunk table. Counters from all chunks are collected in parallel by all workers using available threads. On top of that, there is an obvious Qserv overhead for parsing/analyzing the query, sending thousands of requests to workers, aggregating counters into the result table, etc. It all takes one or two seconds, or less.
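To illustrate the engine behaviour on a plain MySQL/MariaDB server, outside Qserv (the chunk-table name here is hypothetical):

```sql
-- MyISAM keeps an exact row count in the table's metadata, so this
-- returns without reading any rows:
SELECT COUNT(*) FROM Object_57866;

-- The same count appears in the Rows column, which is exact for MyISAM:
SHOW TABLE STATUS LIKE 'Object_57866';
```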
If we were to use a different storage engine, such as `InnoDB`, then things would look very different. Simple `COUNT(*)` queries against `InnoDB` tables require a real scan of all entries in the table’s B-Tree. Such an operation would take many seconds for the large chunk tables.
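One way to see the difference for yourself is via `EXPLAIN` (hypothetical table names; the output is paraphrased in the comments):

```sql
-- MyISAM: no table access at all; EXPLAIN reports
-- "Select tables optimized away"
EXPLAIN SELECT COUNT(*) FROM Object_myisam;

-- InnoDB: a full index scan; EXPLAIN reports type: index with a row
-- estimate covering the whole table
EXPLAIN SELECT COUNT(*) FROM Object_innodb;
```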
However, when you do a real scan that involves a constraint, MySQL would have to read the whole table (or part of it, depending on the query and the availability of indexes):
`SELECT ... FROM ... WHERE something = ...`
Such a query would take a while (maybe hours) on a table of 111 billion rows.
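As a rough, purely illustrative back-of-envelope: at ~100 bytes per row, 111 billion rows is on the order of 10 TB; even at an aggregate scan rate of ~10 GB/s across all workers that is still ~20 minutes, and a cluster delivering an order of magnitude less would indeed be in the hours.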
Thanks @gapon , so simple count queries, even without row-counter stats, should be quick because the MyISAM engine isn’t having to actually read the table rows? Yet on our Qserv it takes an hour on the Object table with the `LIMIT` clause in place to avoid using the counters; it feels like we’re missing something in our set-up that’s not just disk I/O.
As a test I submitted count queries to all Object chunks on all workers directly, bypassing the Czar. That took only a second or two to aggregate all the results, and they weren’t even run in parallel.
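For anyone wanting to reproduce this, the per-chunk query was along these lines (chunk id hypothetical, and assuming Qserv’s usual `<table>_<chunkId>` naming for worker-side chunk tables):

```sql
-- Run against each worker's mysqld in turn; with MyISAM each count is
-- answered from table metadata, hence the near-instant responses
SELECT COUNT(*) FROM dp02_dc2_catalogs.Object_57866;
```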
I’ll do some more tests when back from leave to try and understand what’s going on.
Cheers
Mike
Hi Mike! Yes, the simple row-counting query should be quick in MyISAM. Most likely, you’ve encountered a known issue with the older Qserv design, where the entire chunk table is preloaded into the memory of the worker node before Qserv runs the scans against the chunk. And this is done for each chunk, regardless of its size (or table type). The mechanism was found to significantly affect the performance of most scan queries (except for a few corner cases). After identifying this issue, I eliminated the mechanism from Qserv in the following release: https://github.com/lsst/qserv/releases/tag/2025.5.1-rc1.
Thanks @gapon , that’s put my mind at rest and I can enjoy my holiday now! I’d read something about the changes to memory loading but hadn’t appreciated how they would affect certain things. Our new cluster will have an up-to-date Qserv, so I’ll be able to confirm this when it’s rolled out.