Hi All:
I haven't been able to find an answer to this question in the forums, so I thought I'd ask it myself.
I have attached a spreadsheet from the U.S. Census bureau, which reflects the population of Los Angeles County in 1990 and in 2000, by Census Tract. Unfortunately for my purposes, the Census split many census tracts between 1990 and 2000, resulting in 397 additional tracts in 2000. So for example, 1990 Tract 1011 became year 2000 Tracts 1011.10 and 1011.20.
I'm trying to make a GIS map of the data, but obviously direct comparison between the two Censuses is impossible with the data configured in this way.
How can I merge the 2000 tracts back into their 1990 counterparts? Brief notes:
1) Initially, I simply took out the four digits starting six from the left, using the MID function. Then I used SUMIF.
2) Unfortunately, this created duplicate records for every split tract, which I had to manually delete. I never got every last one because there are so many records and I don't have a good eye.
3) The worse problem is that SUMIF also merged previously split tracts (i.e. tracts that were split between 1980 and 1990), as their MID string was identical to what it would have been if the split occurred between 1990 and 2000.
Is this clear? Anyway, my question remains: How can I use a formula to merge the 2000 tracts back into their 1990 counterparts?
Thanks very much.
Steven Simon
Spreadsheet attached here. Sorry.
not clear at all! what do you want where i have no idea what a tract is.
but if you show before and what you want after and show those bits you dont want touched i.e where it went wrong for you
then ........maybe
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
Sorry about not being clear. You get immersed in your own data and it's hard to bring yourself out.
See the new file. There are two sheets: "Original" is the one I first posted, for reference. "Examples" is an example of what I want "after."
So for example:
In Worksheet "Original," A3 to E3 show 1990 Census data for Tract 1011. A specific Census Tract - it's a defined geography maybe the size of a neighborhood. A3 through D3 are just geographic identifying info. If you look close you'll see there are identical strings of numbers in all of them, identifying the tract as 1011. E3 shows the number of people living in Tract 1011.
Fast-forward to the 2000 Census and Tract 1011 has grown so much that they are splitting the same geography into two parts: 1011.10 and 1011.20. You can see this in Worksheet "Original," F3 to J3 and F4 to J4. I need to merge those back into one tract, as they were in 1990. So look to Worksheet "Examples," and you will see in J3 that I have merged the populations of 1011.10 and 1011.20 into one tract, the old 1011. This is what I want to do to the whole sheet: merge tracts in the 397 instances where they were split for the 2000 Census. Make a similar comparison between the two sheets for Tract 1012 if you'd like another example.
BUT: there are also some instances where a tract was already split into two by 1990, and there is no change in 2000. See for example Worksheet "Examples," A5 through J5 and A6 through J6. Both rows reflect a tract that was once called 1310, but was split before 1990 by the U.S. Census Bureau. I want to keep that. Using a simple SUMIF, I wasn't able to.
So I'm looking for a formula or nested formula that will allow me to merge tracts that were split in 2000, but not merge tracts that weren't split before 1990.
I hope this makes more sense. Sheet is attached.
well ive been at this for 2 hours trouble is it isnt consistant some incremented changing from ending in from 00 to 10,20... some to 01,02,03.....
ther are at least two who were 01 and 02 in 1990 but they have disappeared in 2000 becoming 03,04,05,06. thought i cracked it but ended up with duplicates
basically matching whats in 1990 with 2000 will give a list of unchanged
then weeding out the odd ones is next, then account for incremental changes in 2000
is this a full list for LA, have you got to repeat this for other cities?
i'm off to bed it's nearly 01:00 here and i have to be at work by :08:00 lol
Last edited by martindwilson; 06-18-2009 at 07:59 PM.
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
Does this help? The data that are aligned match.
Microsoft MVP - Excel
Entia non sunt multiplicanda sine necessitate
we really need to know if this is a one off. as SHG has presented i have now highlighted where i dont know what to do
eg
14000US06037910701
14000US06037910702
on 1990
is now shown as
14000US06037910703
14000US06037910704
14000US06037910705
14000US06037910706
14000US06037910707
14000US06037910708
14000US06037910709
14000US06037910710
in 2000 the originals have gone how should the new ones be divided up and allocated to the 2 originals?
"Unless otherwise stated all my comments are directed at OP"
Mojito connoisseur and a dabbler in Cisco
where does code go ?
look here
how to insert code
how to enter array formula
why use -- in sumproduct
recommended reading
wiki Mojito
how to say no convincingly
most important thing you need
Martin Wilson: SPV
and RSMBC
My assumption is that the 14000US06037910701 and 14000US06037910702 are contained within the new 14000US060379107xx tracts. I will compare maps when I get home and respond.
Thanks so much to both of you.
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks