# Solved: SQL: how to use GROUP BY with UNION?



## Jimmy the Hand

Hi All,

The outline of my problem is this. In an analytical laboratory there's a database to store samples, tests and results. I need to retrieve the average time (called "transit time") that was needed to complete each individual tests over a year, and display this average value for each year, separately.
Individual transit times are calculated as the difference between the time of test completion and time of sample login. The current year is always the year of sample login, even if test completion is postponed until next year.

The twist is that there are two sets of DB tables, an active (sample, test) and an archived (c_sample, c_test) set. Their definitions are identical, but one stores recent data, and is used everyday, the other stores old data, and is used rarely. I need data from both, at the same time.

I figured, what I needed to do was making two selects, grouping them by the year value, and then making a union of them. Here's what I did:



Code:


SELECT avg(test.date_completed-sample.login_date) as transit, to_char(to_date(sample.login_date), 'YYYY') as wYear
FROM sample JOIN test
ON sample.id_numeric = test.sample
GROUP BY wYear

UNION

SELECT avg(c_test.date_completed-c_sample.login_date) as transit, to_char(to_date(c_sample.login_date), 'YYYY') as wYear
FROM c_sample JOIN c_test
ON c_sample.id_numeric = c_test.sample
GROUP BY wYear

In my best knowledge the above SELECT should work, but it doesn't. Anyone knows why? The error message is:
*ORA - 00904: "WYEAR": invalid identifier*
Remark: The query does work without the GROUP BY clauses. But, of course, it's not what I need.

Thanks,

Jimmy

PS: 
*The same problem (with almost identical description) has been cross-posted on VBAX forum:
http://www.vbaexpress.com/forum/showthread.php?t=11264*


----------



## Chicon

I think that *GROUP BY* doesn't accept aliases in Oracle.

Therefore :



Code:


[SIZE=2]SELECT avg(test.date_completed-sample.login_date) as transit, to_char(to_date(sample.login_date), 'YYYY') as wYear
FROM sample JOIN test
ON sample.id_numeric = test.sample
GROUP BY to_char(to_date(sample.login_date), 'YYYY')

UNION

SELECT avg(c_test.date_completed-c_sample.login_date) as transit, to_char(to_date(c_sample.login_date), 'YYYY') as wYear
FROM c_sample JOIN c_test
ON c_sample.id_numeric = c_test.sample
GROUP BY to_char(to_date(c_sample.login_date), 'YYYY')[/SIZE]

More examples on this page : http://www.psoug.org/reference/group_by.html


----------



## Jimmy the Hand

Thanks for the help, Chicon, your solution works fine, and the explanation is welcome.
However, running the query results in double occurrences of years. That is, if there are samples of a particular year in both the active and archived table set, then they appear separately, whereas I'd like them together.

E.g.
2002 6,59833026226381 
2003 5,51757685358263 
2004 5,98794781660715 
*2005 7,18336517489712 
2005 2,31719714506173 
*2006 4,42310144857702 
2007 3,38955316961515

I'd like to calculate the average over the unified records of active and archived table sets.
I'm starting to realize that it will require a completely different approach...
Right?


----------



## Aftab

I don't know ORACLE, but this is how I would do it in T-SQL.

SELECT AVG(date_completed-login_date),to_char(to_date(login_date), 'YYYY') as wYear FROM
(
SELECT test.date_completed 'date_completed',sample.login_date 'login_date')
FROM sample JOIN test
ON sample.id_numeric = test.sample

UNION ALL

SELECT c_test.date_completed 'date_completed',c_sample.login_date 'login_date'
FROM c_sample JOIN c_test
ON c_sample.id_numeric = c_test.sample
) AS tbl

GROUP BY to_char(to_date(login_date), 'YYYY')

You combine the tables into one in the inner SELECT. The ALL after UNION ensures duplicate rows are NOT discarded, you will need that to get accurate figures. Then you perform the GROUPING once on the entire set.


----------



## Jimmy the Hand

Thank you Aftab, 
I'm going to put your idea to the test tomorrow and then come back with the result.


----------



## Jimmy the Hand

Aftab,
Your solution had one small flaw. The alias of the inner SELECT ("AS tbl") gave error. When I removed it, the method you drew up worked great. Even so, I had to modify the query considerably, because it had logical errors (as a consequence of bad planning), and the returned results were wrong. But in the end, all worked out fine. Thanks for your help.

Jimmy


----------

