Wednesday, March 21, 2012

Point in Time query >:[

Hey guys and gals,

I'm having a real problem with this query at the moment...
Basically I have to produce a query which will tell me the total number of people employed by the company at any given date and the total salary for all these people.

We have a people table and a career table.
People(unique_identifier, known_as_and_surname, start_date, termination_date ...)
Career(unique_identifier, parent_identifier, career_date, basic_pay ...)
Relationship people.unique_identifier = career.parent_identifier

Employees can be identified like so

SELECT *
FROM people
WHERE start_date <= DateSelected
AND (termination_date > DateSelected
OR termination_date IS NULL)

Passing the selected date to the query is no trouble at all I am just having problems with the point in time side of this.

All and any help is greatly appreciated :)
~George

P.S. SQL Server 2000 ;)...I am just having problems with the point in time side of this.could you elaborate on this a bit please?

because your query looks fine, all you need is an INNER JOIN as well as a GROUP BY and some aggregate expressions|||george ... is People to Career a 1:1 or 1:many relationship? What if a personn's salary changes during the date range in question?

If 1:1, then a count of unique identifiers and sum of the salary from the Career table using a join on a filter from the People table using the date criteria would do the job.

If however, you can have a salary change during the data range, and you have two rows in the Career table, you will need someone to define the business rules for that situation.|||could you elaborate on this a bit please?

I need to find out if they were an employee at any given date...
There is no career_date_from or to fields, just a single career_date - which is part of the problem!

Tom, One person can have many career history lines. and the problem is - how do I make this a range?

Here's some samlpe data that may help

unique_id name start_date termination_date
00001 George V 01/11/2006 NULL
00002 Tom 53 01/06/2004 01/06/2007
00003 Rudy 937 07/07/2007 NULL

unique_id parent_id career_date basic_pay
1 00001 01/11/2006 150
2 00001 01/12/2006 165
3 00002 01/06/2004 155
4 00003 07/07/2007 160
5 00003 09/07/2007 170

If I entered 02/11/2006 as my criteria I'd want to return the sum of the following lines

unique_id parent_id career_date basic_pay
1 00001 01/11/2006 150
3 00002 01/06/2004 155

Which gives us
2 employees : £305|||Do you just want the "last" (by career_date) record in Career where the career_date is less than or equal to the DateSelected?|||I think that's it!
I believe that makes it look something like this:

DECLARE @.SelectedDate datetime
SET @.SelectedDate = '2006-11-02'

SELECT Count(*)
,Sum(c.basic_pay)
FROM people e
LEFT JOIN career c
ON c.parent_identifier = e.unique_identifier
AND c.career_date = (
SELECT Max(career_date)
FROM career
WHERE parent_identifier = c.parent_identifier
AND career_date <= @.SelectedDate
)

That's what I couldn't get my head around :)|||george ... reference uniqueid '00002' ... you can't fire me ... I QUIT ;)|||george, your sample data was the key to understanding the data relationship (which was not at all apparent from post #1)

just another example of why we ask posters to show sample data :cool:

p.s. those unique_ids are awful!|||They are aweful, but you know what...
It allowed the original developers to make an inbuilt query designers that fools can use - which helps me a little. The only other benefit is that you know the relationships between almsot everything simply by logic.

But yes, it was like this when I got it :p

Can one of you kindly check the following code over once? It's my "final" result

DECLARE @.SelectedDate datetime
SET @.SelectedDate = '2005-11-01'

SELECT Count(*)
,Sum(c.basic_pay)
FROM people e
LEFT JOIN career c
ON c.parent_identifier = e.unique_identifier
AND c.career_date = (
SELECT Max(career_date)
FROM career
WHERE parent_identifier = c.parent_identifier
AND career_date <= @.SelectedDate
)
WHERE (e.termination_date > @.SelectedDate
OR e.termination_date IS NULL)
AND e.start_date <= @.SelectedDate

Oh and Tom, you were never fired... You just didn't turn up ;)

And Rudy; yes it occured to me that I never mentioned that it was 1:M...
It's Monday, I'm frazzled already!
I came in this morning and my monitor was covered in sticky notes because of missed calls etc. So lame.
Not a good start to the week.

Finally - thank you all :)|||Once again Poots' incisive logic cuts to the very core of the problem :cool:

Ok - this:
SET @.SelectedDate = '2005-11-01'
is not guarenteed to work in all system set ups. Better is:
SET @.SelectedDate = '20051101'
Also - is there a unique constriant on the composite key parent_identifier, career_date (assuming a person cannot have two career records in a day)? If not there could be two records for a person on a given day -> errors in the count and sum.|||SET @.SelectedDate = '2005-11-01'

This query will be translated into a 3rd party program that will only run on SS 2000 with it's own run time expression builder - so this was purely for testing purposes ;)
You can have two career records on one day... which is something I had not thought about. Would an order by clause sort this out (if say, I ordered it by date_entered)?|||Damnit - the dupes are causing me a problem.
There are around 10 people who are being counted twice!

How can I eliminate these?|||Extend the same logic again. You wanted the max(career_date). You now want the max(date_entered) for the max(career_date)...|||Yeah... Having trouble with that.

AND c.career_date =
(
SELECT Max(x.career_date)
FROM career x
WHERE x.parent_identifier = c.parent_identifier
AND x.career_date <= @.SelectedDate
AND x.created_by_user =
(
SELECT Max(created_by_user)
FROM career
WHERE parent_identifier = c.parent_identifier
AND career_date <= @.SelectedDate
)

Is not what I want... I'm sorry; lack of sleep + stress =
...can't even think of a word suitable to complete that sentence :( *sigh*|||Untested but worrabout:
... AND c.career_date = (
SELECT TOP 1 career_date
FROM MySchema.career
WHERE parent_identifier = c.parent_identifier
AND career_date <= @.SelectedDate
ORDER BY career_date DESC, created_by_user DESC
)|||SQL Server 2000 - no TOP *sigh*|||SQL Server 2000 - no TOP *sigh*Yeah - you mentioned that before. It is in SQL 2k.

What error do you get again?|||You now want the max(date_entered) for the max(career_date)...hmmm, sounds mysteriously like minimum price on earliest date (http://www.dbforums.com/showthread.php?t=1618384)

:cool:|||Sounds like it - but I can't use local views - courtesy of SS2K :o

However, I think I may have cracked it!
It's not pretty, but (I'm fairly sure :p) it works!

SELECT Count(*)
,Sum(c.basic_pay)
FROM people e
LEFT JOIN career c
ON c.parent_identifier = e.unique_identifier
AND c.career_date = (
SELECT max(c2.career_date)
FROM career c2
WHERE c2.parent_identifier = c.parent_identifier
AND c2.career_date <= @.SelectedDate
)
AND c.datetime_created =(
SELECT max(c3.datetime_created)
FROM career c3
WHERE c3.parent_identifier = c.parent_identifier
AND c3.career_date = c.career_date
)
AND (e.termination_date > @.SelectedDate
OR e.termination_date IS NULL)
AND e.start_date <= @.SelectedDate

What you think? :)|||Looks fine. Shame it takes three scans of the careers table :confused:

I think you should post the lack of TOP as a thread - it ain't right I tell ya it ain't right. Check it out in BoL - should be there.|||Yep - it's not exactly the fastest query in the west, but on this occasion I'm going to let it slide - because I actually can't come up with a better solution!

Why should the lack of TOP be a thread?
I suppose it is odd that it recognises TOP as a keyword (highlights it blue)

SELECT TOP 1 FROM people ORDER BY birth_date DESC
-----
Server: Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'FROM'.|||Why should the lack of TOP be a thread?
I suppose it is odd that it recognises TOP as a keyword (highlights it blue)

SELECT TOP 1 FROM people ORDER BY birth_date DESC
-----
Server: Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'FROM'.
Duhhhhhh! SELECT TOP 1... what?

SELECT TOP 1 *, myfield, 'George is a plonker' AS plonky_george
FROM people
ORDER BY birth_date DESC
-----
Server: Msg 156, Level 15, State 1, Line 1
Incorrect syntax near the keyword 'FROM'.|||More interestingly

AND c.datetime_created =(
SELECT max(c3.datetime_created)
FROM career c3
WHERE c3.parent_identifier = c.parent_identifier
AND c3.career_date = c.career_date
ORDER BY c3.career_date DESC
)

Gives me

Server: Msg 1033, Level 15, State 1, Line 21
The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless TOP is also specified.

It's just toying with me!|||That is invalid and also the order by is superfluous there anyhoo.|||Apologies

SELECT TOP 1 birth_date FROM people ORDER BY birth_date
----
Server: Msg 170, Level 15, State 1, Line 1
Line 1: Incorrect syntax near '1'.

EDIT: How many times will I make the same mistake?|||Apologies

SELECT TOP birth_date FROM people ORDER BY birth_date
----
Server: Msg 207, Level 16, State 3, Line 1
Invalid column name 'TOP'.
Duhhhhhhhhhhh! ;)
SELECT TOP how many?|||this time you forgot the top how many

come on, george, slow down and do some desk checking (an age-old debugging technique where you actually read what you just wrote to see if it makes sense)

:)|||See above (edit) :D

My edit beat your posts ;)

And yes, sorry :(

And another smiley for luck :cool:

It's been a long hectic day :o|||Ok - so can we confirm that the below query and error go together?

SELECT TOP 1 birth_date FROM people ORDER BY birth_date
----
Server: Msg 170, Level 15, State 1, Line 1
Line 1: Incorrect syntax near '1'.

If so George - start a thread. This is some sort of error.|||Yes, that is correct.
Damnit.
*goes off to start a new thread* :rolleyes:

Thank you for your patience - and I'm glad you had fun when I misposted ;)

SELECT TOP 1 *, myfield, 'George is a plonker' AS plonky_george
FROM people
ORDER BY birth_date DESC|||I'm glad you had fun when I misposted ;)When you are as easily amused as me every day is a riot ;)

No comments:

Post a Comment