On 11/1/2010 1:42 PM, matt wrote:
> Hi all,
>
Hi Matt,
> I'm struggling with a concept here, and I'd like to pick some brains.
> I've got a series of report pages with complicated queries that return
> row counts in the 10s of thousands. These are paginated, and after
> accepting search and sort criteria from the user, I'm displaying a
> header that looks like this:
>
> Search contains 29,657 records. Records per page [select 25, 50, 100]
> Page [1] 2 3 ... 1186 1187 Next>>
>
> To get record totals and then deduce page numbers, I'm pulling a full
> recordset from MySQL and using PHP to display the appropriate rows.
> The problem is that, in the example above, at least, that query takes
> about 70 seconds to run. If I were to limit the search criteria so
> that it only returns <100 rows, then it's lightning fast. We're using
> ndb-cluster, so JOINs really have a performance impact.
>
Clear (and common) problem.
> This leads me to believe I could take advantage of LIMIT, however, I
> don't see how I'd get my total row count anymore.
That aside: even with LIMIT, if you rerun the whole query on every
invocation of that page, the database will probably still need the 70
seconds to produce the result, only to hand 10 rows back to PHP.
>
> Usually these reports are one-to-one with a particular table, so I've
> entertained the idea of running a separate COUNT query against that
> table with no JOINs. This won't suit my purposes because in a lot of
> cases, the search criteria are against fields in JOINed tables.
>
ok.
> I've also thought about using temporary tables that are periodically
> updated (think Oracle's static views.) The problem there is that a
> lot of times, our users will go directly from a data entry page to a
> report and expect it to reflect the new data that was just entered.
> If I use these "snapshot" tables, I would also need to implement a
> mechanism that refreshed them whenever data is inserted/updated/
> deleted elsewhere.
>
> Finally, the last approach I've considered is what I call the
> "PeopleSoft" method. That is, if you want a report, you enter all of
> your criteria, choose a format (XLS, HTML, etc.) and the report goes
> into a queue. You don't wait for your results, but instead go to a
> process management page, which would likely be AJAX enabled to inform
> you when the report was built and ready for viewing. I think this
> would be the least convenient solution for our users.
>
> I'm just curious to see how other people have addressed this problem
> and what you've found to be the up/down sides of your chosen solution.
>
The way I approach this:
1) Run the full query once, but store only the IDs: a key that lets you
fetch the rest of each tuple later.
(For example: store a userid, since you can easily look up the rest of
the accompanying information from a userid.)
2) Store the IDs in the session.
3) Write some (basic) code that produces the pagination and the number
of results per page.
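A rough sketch of step 3, written in Python for brevity rather than PHP.
The function name paginate and the stored_ids list are made up for
illustration; stored_ids stands in for whatever you saved in the
session. The point is that the total row count and the page numbers
fall straight out of the stored ID list, with no second COUNT query:

```python
import math

def paginate(ids, page, per_page):
    """Return (total_rows, total_pages, ids_for_page) for a 1-based page number."""
    total_rows = len(ids)
    total_pages = max(1, math.ceil(total_rows / per_page))
    page = min(max(page, 1), total_pages)   # clamp out-of-range page requests
    start = (page - 1) * per_page
    return total_rows, total_pages, ids[start:start + per_page]

# Stand-in for the ID list saved in the session after the one expensive query.
stored_ids = list(range(1, 29658))          # 29,657 IDs, as in Matt's example

total_rows, total_pages, page_ids = paginate(stored_ids, page=1, per_page=25)
print(f"Search contains {total_rows:,} records, {total_pages} pages")
```

With 29,657 IDs and 25 per page this gives 1187 pages, which matches the
header in Matt's example.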
If your visitors click >>next>> or something similar, you can use the
userids you stored in the session to fetch the needed results from the
database.
You will use something like:
SELECT userid, username FROM tbluser WHERE (userid IN (xx,xx,xx,xx,xx));
where the xx are the userids for the current page, pulled from the
session.
That way you only fetch the rows you need from the backend (database).
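To make that fetch concrete, here is a minimal sketch, again in Python,
with an in-memory SQLite table standing in for the real MySQL backend.
The helper name fetch_page and the demo data are assumptions for
illustration only; tbluser, userid and username come from the query
above. Two details are worth noting: the IDs are bound as parameters
rather than pasted into the SQL string, and since IN (...) gives no
ordering guarantee, the rows are put back into the order of the stored
ID list so the user's sort order survives:

```python
import sqlite3

def fetch_page(conn, page_ids):
    """Fetch only the rows for one page, returned in the order of page_ids."""
    if not page_ids:
        return []
    placeholders = ",".join("?" for _ in page_ids)   # one "?" per ID, never raw values
    sql = f"SELECT userid, username FROM tbluser WHERE userid IN ({placeholders})"
    rows = {row[0]: row for row in conn.execute(sql, page_ids)}
    # IN (...) does not guarantee order, so restore the stored sort order here.
    return [rows[i] for i in page_ids if i in rows]

# Demo with an in-memory SQLite table standing in for the real MySQL backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbluser (userid INTEGER PRIMARY KEY, username TEXT)")
conn.executemany("INSERT INTO tbluser VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 101)])

page = fetch_page(conn, [42, 7, 19])
print(page)
```

IDs that have been deleted since the list was stored are simply skipped,
so a page may come back slightly short.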
Will such a query be slow?
(If the answer is YES, then this approach won't work. In most cases I
have encountered it did work, because the complex table scans no longer
run for each search.)
Drawback:
- If you have a really huge resultset, your session will grow too, of
course. But I consider that a lesser problem than repeatedly running
needless queries against the database, of which only a few rows will
be shown.
> And, yes, the DBA is actively investigating why ndb is running much
> slower than the same queries on InnoDB or MyISAM. That's a separate
> post. For the purposes of this discussion, I'm interested in software
> solutions.
>
> Thanks,
> Matt
Good luck.
Regards,
Erwin Moller
--
"There are two ways of constructing a software design: One way is to
make it so simple that there are obviously no deficiencies, and the
other way is to make it so complicated that there are no obvious
deficiencies. The first method is far more difficult."
-- C.A.R. Hoare