Ticket #79 (closed enhancement: fixed)

Opened 4 years ago

Last modified 6 months ago

Picking database columns for QQuery

Reported by: alex94040 Owned by: somebody
Priority: important Milestone: 2.1
Component: QQuery Version:
Keywords: qquery Cc:

Description (last modified by alex94040) (diff)

We need the ability to say which columns we'd like to SELECT in a QQuery statement. Otherwise, the performance of the queries generated by it is very sub-optimal. Lack of this ability, for example, kills the concept of a "covering index".

Proposed syntax: A query that only grabs and fills in the Id and Name variables of the User object array

$query->AddClause(QQ::SelectColumn(QQN::User()->Id, QQN::User()->Name));

Attachments

SelectColumns.patch Download (4.0 KB) - added by alex94040 3 years ago.
Early (and lightly tested) patch for the issue - run sample.php
ticket79.patch Download (13.5 KB) - added by vakopian 21 months ago.
ticket79-v2.patch Download (13.2 KB) - added by vakopian 21 months ago.
ooops, wrong patch. Here is the correct one
ticket79-v3.patch Download (22.4 KB) - added by vakopian 7 months ago.
A new patch, with a full solution that works for normal queries as well as Expand and ExpandAsArray?. Contains a full example. Check out the example for usage

Change History

Changed 4 years ago by VexedPanda

This is mentioned in:
<http://trac.qcu.be/projects/qcubed/ticket/5>
I'll leave this ticket open as a child of that.

Changed 3 years ago by basilieus

Innodb is a clustered index so it already uses a type of "covered index" because the primary key is attached to all child keys.

Changed 3 years ago by VexedPanda

Yes, but if I had a clustered index on PK, FK1, C5, while C1,C2,C3 and C4 weren't under it, and I only cared about the contents of PK, FK1, and C5, that clustered index would currently be useless, since QQ would do a select *, and not a select PK, FK1, C5.

Changed 3 years ago by basilieus

Doesn't it also have to do with the WHERE clause and how you use it? Such as the = operator?

Changed 3 years ago by VexedPanda

Sorry, I've been using the wrong terminology. I am referring to covering composite indexes, not clustered.

While you gain speed improvement by limiting where/having/group by/order by clauses to just columns covered by a single index, you can gain even more when you also only select columns covered by that index (making that index cover the query).

To quote  http://www.mssqlcity.com/Tips/tipInd.htm:
A covering index is an index, which includes all of the columns referenced in the query. So the creating covering index can improve performance because all the data for the query is contained within the index itself and only the index pages, not the data pages, will be used to retrieve the data. Covering indexes can bring a lot of performance to a query, because it can save a huge amount of I/O operations.

Changed 3 years ago by alex94040

  • keywords qquery added
  • component changed from Codegen to QQuery
  • description modified (diff)

Changed 3 years ago by alex94040

Early (and lightly tested) patch for the issue - run sample.php

Changed 3 years ago by alex94040

  • status changed from new to in_QA

Experimental patch for the issue - tested with only the simplest scenarios; please review. Also, please test with complex queries (joins, expand-as-array, etc).

Changed 3 years ago by VexedPanda

Excelent. I've been looking forward to this ticket getting some work. I took a look at the code, and this may work. Luckily the InstantiateDbRow? code already checks for the existance of a column in the result, so there were less changes needed than I expected. It definately needs stringent testing though.

I'd really like to see an approach that prevents the AddSelectItem? call rather than adding-then-removing the column though.

Changed 3 years ago by alex94040

Preventing AddSelectItems would require GetSelectFields() in the ORM classes to know something about the clauses of the query that's being executed; I don't think it's worth it to have such a major overhaul to get this minor performance improvmeent.

Changed 3 years ago by VexedPanda

My original thoughts were that we would not try and treat it as a clause, but rather as a 3rd parameter to the load functions (in addition to conditions and clauses). That would still require changes to the template, but may be manageable enough to be worth doing for the performance and clear code-path (maintainability).

Changed 3 years ago by alex94040

I'm not convinced that performance implications will be at all noticeable here - we're talking about adding a few items to the array, and then removing them. I certainly don't want to completely re-implement this unless there's a really good reason to. If you feel passionate about it, I encourage you to try that alternate approach.

Let's focus on testing the crap out of this baby to see if there are actual semantic issues here (joins, expand-as-array, etc)

Changed 3 years ago by basilieus

Alright I did some basic testing and it looks like it doesn't work as intended when you expand a table and want only something from that table.

So if you expanded addresss and only wanted a certain column from address you wouldn't get it, what it does is just put all of the columns from the address table in the query.

I also would make note that when I put QQ::Expand() after the SelectColumn?() I had issues displaying all of the "Select Fields" via debugging. After I moved my expand in front of SelectColumn? I could see the other select fields.

Changed 3 years ago by VexedPanda

Ok, whatever we end up with should work with all the QQ functionality.

Anyone care to make a list of use-cases?
Here's a start:
1) Load all people, with just ID and name
2) Load all people, with just their id, name, and login username
3) Load all people with usernames begining with b, with just their id and name

Changed 3 years ago by MatthiaZ

I don't know if this is already possible but if it's not it would have to be possible to also add calculated virtual columns too.

E.g.: select id, firstname,lastname, concat(firstname,' ', lastname) as fullname from people

Or would that be a different issue?

Changed 3 years ago by VexedPanda

As far as virtual columns are already supported, I think this should support them too. Adding new types of virtual columns, or beefing up support for them would be a seperate ticket though.

Changed 3 years ago by alex94040

  • milestone changed from 1.1 to 1.2

Changed 3 years ago by VexedPanda

Too bad, this is a big enough change, and with an existing patch. I'd have loved to see it in 1.1, to further impress. :)

Changed 3 years ago by alex94040

Yeah, I know... It's just that the patch doesn't quite work well, and I don't want to rush such an important thing (because of the risk). We can prepare for this by having great unit test coverage for QQuery.

Changed 3 years ago by VexedPanda

  • milestone changed from 1.2 to 2.0

Changed 2 years ago by VexedPanda

Another possible approach was described here:
 http://www.qcodo.com/issues/view.php/42

Changed 2 years ago by alex94040

  • milestone changed from 2.0 to 2.1

Changed 2 years ago by VexedPanda

There's some really good discussion around how to treat this in  http://www.qcodo.com/issues/view.php/42/ right now

Changed 22 months ago by VexedPanda

  • milestone changed from 2.0.2 Stable to 2.1.0

Changed 21 months ago by vakopian

I propose a simple change to Alex's patch, so that it doesn't need to remove select items, but rather avoid adding all columns. Here is how. We make a very simple change to the codegen, so that instead of unconditionally calling GetSelectFields($objQueryBuilder) from BuildQueryStatement, it actually checks if any of the optional clauses is a QQSelectColumn:

$blnAddAllFieldsToSelect = true;
if ($objOptionalClauses instanceof QQSelectColumn) {
	$blnAddAllFieldsToSelect = false;
} else if (is_array($objOptionalClauses)) {
	foreach ($objOptionalClauses as $objClause) {
		if ($objClause instanceof QQSelectColumn) {
			$blnAddAllFieldsToSelect = false;
			break;
		}
	}
}

if ($blnAddAllFieldsToSelect) {
	Project::GetSelectFields($objQueryBuilder);
}

Then we can do:

User::LoadAll(QQ::SelectColumn(QQN::User()->Name));

to only load the names of the users.
With this, and by changing Alex's patch to add select items instead of removing them, I think we can get to the desired goal.

If this suggestion is ok I can create a new patch.

Changed 21 months ago by mcluver

Have you tried testing this yet?

Changed 21 months ago by VexedPanda

We're going to need very robust test cases around all this, and I'd even suggest they get written first, to solidify the desired syntax of this whole thing, and ensure that any patch we're evaluating gets solidly covered, including expands / conditional expands / expand as arrays / group bys / etc.

That said, this strikes me as a good approach, and I'd love to see it get into testing. :)

Changed 21 months ago by alex94040

This sounds like a good approach, Vakopian. Can you put together a new patch?

Changed 21 months ago by alex94040

  • milestone changed from 2.1.0 to 2.0.3

Changed 21 months ago by VexedPanda

Hmm, not sure I see this being a match for 2.0.x (which I see as more of as maintenance releases at this point). This is definitely a feature that would be worth advertising as part of a minor upgrade version. Preferably along with subsql QQ support (look at what fernando has done over at QCodo).

Changed 21 months ago by alex94040

  • milestone changed from 2.0.3 to 2.1.0

Makes sense. 2.1 it is.

Changed 21 months ago by vakopian

Well, this turned out to be more difficult than I expected - mainly the part about "add select items instead of removing them". We can say a lot of good things about Mike Ho, but the man loooooves repeating code.
I ended up refactoring a lot of the QQNode code, to get all the bits and pieces needed for finally calling $objBuilder->AddSelectItem(). The resulting patch is attached. I know it's a large patch, but it's mostly refactorings (and even removing large blocks of repeated code).
And as promised, it's fully backward compatible. In fact the existing unit tests behaved the same on my box before and after the patch.
So, like Vexed said, let's write lots of unit tests, and test it in as many ways as possible.

Changed 21 months ago by vakopian

Changed 21 months ago by vakopian

ooops, wrong patch. Here is the correct one

Changed 20 months ago by VexedPanda

Can you validate that the generated SQL also hasn't changed? In theory, this looks really good though. :) I hope to have more time to check it out and test it next week.

Changed 20 months ago by vakopian

I wish there was an easy way to write unit tests for checking the generated SQL without running it. I don't think QQ has a public API to just get the SQL. As far as I can tell, the method that composes the SQL is in the generated files, it's BuildQueryStatement?() and it's protected. In fact for my own testing/debugging I had made this method public so I could easily generate SQL (without running it and looking at the logs). Do you know a better way to achieve the same thing? Perhaps we should think about making BuildQueryStatement? public for real?

Changed 20 months ago by alex94040

I think it's totally fine to make BuildQueryStatement? public, as long as there's a big comment on top of it that says "DO NOT CALL THIS METHOD in your application. It is made public exclusively for unit testing purposes, to allow the test framework to verify the SQL generated by QQuery."

Changed 7 months ago by vakopian

(In [1305]) Cleanup and refactoring. Refs #79

Changed 7 months ago by vakopian

A new patch, with a full solution that works for normal queries as well as Expand and ExpandAsArray?. Contains a full example. Check out the example for usage

Changed 6 months ago by alex94040

  • milestone changed from 2.1.0 to 2.0.3

WOW, this is AMAZING work!!! Can't wait to play with it myself.

Changed 6 months ago by alex94040

I'd love to add unit tests as a part of this checkin.

Changed 6 months ago by vakopian

  • status changed from in_QA to closed
  • resolution set to fixed

(In [1317]) reviewed by alex. added unit tests. Fixes #79

Changed 6 months ago by alex94040

Outstanding work!!! Thanks so much!

Note: See TracTickets for help on using tickets.