PeopleFinderTechStructuredDataSets
From Katrina Help Info
Red Cross (ICRC)
http://www.familylinks.icrc.org/katrina
Records: 134,000+
Status: Scraped and verified against schema. Working on splitting into multiple smaller files to upload.
Contact: op_prot_eur.gva [at] icrc.org
Notes:
- I have sent e-mail to this contact address -- JamesDennett
- Since Bill has run out of time to work on this scraping project I'll take a crack at it. I can be contacted via email at brent [at] bjohnson.net
- I have modified the scraper to handle deltas and to parse first and last names. I ran the scraper last night and collected 94320 records. Unfortunately it hit an unexpected page and stopped. I'm rerunning the scrape and picking up the names that have been added since the last scrape and the ones after it stopped. --Bljohnson 14:13, 7 Sep 2005 (EDT)
- Scrape completed -- Brent
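The delta handling and name parsing described in the notes above could look roughly like the sketch below. It is not the actual scraper: the record layout (a dict with an "id" key), the function names, and the name-splitting rule are all assumptions made for illustration.

```python
def split_name(full_name):
    """Split a display name into (first, last).

    Hypothetical rule: "Last, First" if there is a comma,
    otherwise the final word is treated as the last name.
    """
    full_name = " ".join(full_name.split())  # collapse runs of whitespace
    if "," in full_name:
        last, _, first = full_name.partition(",")
        return first.strip(), last.strip()
    first, _, last = full_name.rpartition(" ")
    return first, last


def new_records(scraped, seen_ids):
    """Return only records whose id is not in seen_ids, and mark them seen.

    Running this on every scrape yields just the delta since the last run,
    so a scrape that stopped early can be resumed without duplicates.
    """
    fresh = []
    for record in scraped:
        if record["id"] not in seen_ids:
            seen_ids.add(record["id"])
            fresh.append(record)
    return fresh
```

Persisting seen_ids between runs (for example in a text file, one id per line) is what would let a rerun pick up only names added since the last scrape.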
Gulf Coast News Survivor Connector
http://wx.gulfcoastnews.com/katrina/status.aspx
Records: 58,931
Status: Validated.
Contact: devin at nacredata.com
Contact: ken [at] gulfcoastnews.com
Notes: Emailed requesting participation on Tues 9/6 3:19pm
Katrina Data Project
http://www.katrinadataproject.com/index.aspx
Records: 33,743
Status: David G. has talked with them, and they're willing to work with us. We need someone to help them with PFIF.
MSNBC "Looking for" and "Safe" lists
http://www.msnbc.msn.com/id/9159961/ (Looking for)
http://www.msnbc.msn.com/id/9159954/ (Safe)
http://www.msnbc.msn.com/apps/connect/search.aspx/ (Search form)
Records: 150,000+
Status: Currently scraping.
Contact:
Notes:
- I've started scraping the "Looking for" list now.
Public People Locator
http://www.publicpeoplelocator.com/
Records: 32,755
Status: some duplication in data, but fairly complete and orderly.
Contact: katrinapeoplefinder [at] yahoo.com
Notes: Emailed for participation on Tuesday 9/6 11:13pm PST :: AaronPava 02:15, 7 Sep 2005 (EDT)
Family Messages
http://www.familymessages.org/index.php
Records: 15,922
Status: Willing to help. Developer is on katrina dev mailinglist.
Contact: chaney [at] dcre-labs.com
Note: Implementing PFIF. Test feed is at http://familymessages.yahoo.net/new/pfif.php?page=1
Emailed for status Tuesday 9/6 2:23PM PST
Response: (2:31 PST) Thanks Aaron, an extra set of eyes is always helpful.
Please take a look at (removed for privacy) for the "in test" examples. We have feedback concerning the UTC timestamp, xml headers and the need for a pfif: tag at the beginning. That is being worked now but if you see anything else amiss, please don't hesitate.
-dan
Katrina Safe
http://www.katrinasafe.com/WebEntryApplication/searchform.aspx
Records: Maybe 5,000 to 20,000 (There are 95 Smiths, and Smiths are about 1% of the US population.)
Status: not browsable; will need cooperation from site
Contact: katrinasafeweb [at] hotmail.com
Notes: Emailed for participation on Tuesday 9/6 11:16pm PST :: AaronPava 02:19, 7 Sep 2005 (EDT)
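The record estimate above comes from scaling one surname count by its share of the population. A minimal sketch of that arithmetic, using only the two figures quoted in the Records line (the function name is made up):

```python
def estimate_total(matches, population_share):
    """Estimate total records from the count for one surname.

    matches: records found for the surname (95 Smiths here)
    population_share: fraction of the population with that surname (about 1%)
    """
    return round(matches / population_share)


print(estimate_total(95, 0.01))
```

That gives roughly 9,500, which sits inside the 5,000 to 20,000 range. The wide range reflects how rough the 1% figure is and how unevenly surnames are distributed by region.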
Hurricane Katrina Survivor Registry
http://www.katrina-survivor.com/
Records: 13,987
Status: not browsable; will need cooperation from site.
I think this is browsable. Try searching for '%':
http://www.katrina-survivor.com/searchbyname.php?FirstName=%25&MiddleName=&LastName=
I'm starting to scrape it now. -- ZBerke 01:44AM, 8 Sep 2005 (PST)
Contact: gtg944q [at] mail.gatech.edu (Justin Harper)
Notes: Emailed for participation on Tuesday 9/6 11:20pm :: AaronPava 02:21, 7 Sep 2005 (EDT)
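For anyone scripting the '%' wildcard search mentioned in this entry: the percent sign has to be URL-encoded as %25. A small sketch using Python's standard library, assuming the parameter names shown in the URL above are the only ones the search page needs:

```python
from urllib.parse import urlencode

BASE = "http://www.katrina-survivor.com/searchbyname.php"


def wildcard_search_url(base=BASE):
    """Build the search URL with '%' as the first-name wildcard."""
    # urlencode percent-encodes the literal '%' as '%25'.
    params = [("FirstName", "%"), ("MiddleName", ""), ("LastName", "")]
    return base + "?" + urlencode(params)


print(wildcard_search_url())
```

This reproduces the URL quoted above. Whether the site returns every record for that query is something to confirm against the reported record count.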
LNHA Katrina Evacuee Directory
http://www.lnha.org/katrina/default.asp
Records: 4,500 (roughly)
Status: Added to list Sept 7, 10:30PM PST. Appears in its entirety on one page: search by last name '%'.
Contact: info[at]lnha.org (has not been contacted)
Katrina Finder
Records: 4,223
Status: Helping out. Implementing RSS spec. Developers on mailinglist.
Contact: dan[at]katrinafinder.us
Notes: emailed for status Tues 9/6 2:30 PST
Katrina Tracker
Records: 3,052
Status: Up for helping, developer is not on katrina dev mailinglist though.
Contact: help[at]katrinatracker.com (Paul)
Notes: emailed for status Tues 9/6 2:34 PST
I contacted Paul on 9/13. He is going to send me example data to get started. - Geoff Webb
Response: spoke at 3:25 by phone. Open to participate. Will get on mailing list.
Hurricane Help
http://katrina.earthlink.net/people/list
Records: 2,925
Status: Helping out. Implementing RSS spec. Developers on mailinglist.
Contact: holland3 [at] corp.earthlink.net
Note: Implementing PFIF feed
Emailed for status Tuesday 9/6 2:26PM PST
Houma Shelters
Records: 2,800
Status: The official website for evacuee shelters in the Houma/Terrebonne Parish area. I (the webmaster) am working on a PFIF export.
Contact: webmaster [at] houmashelters.com / matthew [at] phusikos.com
Notes: emailed for status at 2:15 PST 9/6
Response: AaronPava 16:37, 8 Sep 2005 (EDT)
Hi Aaron,
Sorry for not getting back to you earlier. I just got back to work yesterday and my plate is all-too-full. It would be great if you or other developers could assist. I began work on it, but I'm just not sure if I'll have the time to finish up. I can send you the work-in- progress, the database schema we're using, and any other information that might be of help.
Thank you, Matthew
Hurricane Katrina Persons DB
http://connect.castpost.com/fulllist.php
Records: 2,290
Status: Scraped.
Contact: katrina [at] castpost.com
Notes: Emailed for participation on Tuesday 9/6 11:23pm PST :: AaronPava 02:24, 7 Sep 2005 (EDT)
Validation: Attempted to validate XML file against http://www.w3.org/2001/03/webdata/xsv and received the following error:
The following tags were not closed: xsvHardFault. Error processing resource 'http://www.w3.org/2001/03/webdata/xsv?docAddrs...
Find Katrina
Records: 2,580, but fewer after garbage is cleaned up
Status: first attempt posted at http://www.dwiggins.net/katrinadev/findkatrina.rss (findkatrina.rss)
Contact: alexkehr [at] mac.com
(I have sent e-mail to this contact address -- JamesDennett (jdennett).)
Notes: Sent email to see about participation on Tues 9/6 3:57pm PST
Scraping Update: Open questions about this feed:
- Is it ok to leave the source date field blank if the original repository doesn't provide an entry date for the record?
- The original source just has a combined "contact" field, rather than separate fields for phone and e-mail. Briefly considered trying to grok out e-mail addresses, etc. But this would be pretty unreliable given the dirtiness of the data. So I just duplicated the data into both, figuring in this case it would improve searchability. Is this the right thing to do?
- Similarly, the original source does not separate first and last names. So people have entered them in all sorts of combinations. Rather than trying to interpret this and risk getting it wrong, I just put the whole string in last name. Is this ok?
- Ditto on address --- there is just one "lives in" field, so the data is pretty dirty. Just dumping it all in home_city. In addition, I am attempting to grok out the state (based on multiple choice of AL, LA, or MS) and fill it in automatically. If I can't match it to one of these states I'm leaving this field blank.
Appreciate any feedback! --Dmdwiggi 13:25, 8 Sep 2005 (EDT)
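To make the choices in the questions above concrete, here is a sketch of the mapping as described: whole name into last name, contact duplicated into phone and e-mail, "lives in" dumped into home_city, and the state filled in only when it matches AL, LA, or MS. This is not the code behind the posted feed. The dictionary keys are illustrative and may not match the exact PFIF element names, and the matching rule (exactly one state abbreviation appearing as a whole word) is an assumption.

```python
import re

STATES = ("AL", "LA", "MS")


def guess_state(lives_in):
    """Return AL, LA or MS if exactly one appears as a whole word, else ''."""
    text = lives_in.upper()
    found = {s for s in STATES if re.search(r"\b%s\b" % s, text)}
    # Ambiguous ("Biloxi MS or Mobile AL") or unmatched input stays blank.
    return found.pop() if len(found) == 1 else ""


def to_record(name, contact, lives_in):
    """Map one source row to a flat record without interpreting dirty fields."""
    return {
        "first_name": "",               # not separable in the source
        "last_name": name.strip(),      # whole name string goes here
        "phone": contact.strip(),       # combined contact field, duplicated
        "email": contact.strip(),
        "home_city": lives_in.strip(),  # whole "lives in" string
        "home_state": guess_state(lives_in),
        "source_date": "",              # source provides no entry date
    }
```

Upper-casing before matching catches entries like "New Orleans, La." but also matches place names such as "La Place", so this heuristic can produce false positives on data this dirty.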
Katrina Survivor
http://www.katrinasurvivor.net/find.cfm?PageNum_GetAll=1&sort=name
Records: 2,151 posts
Status: need to scrape
Contact: webmaster [at] katrinasurvivor.net (Joe Bykowski)
Notes: Emailed for participation on Tues 9/6 4:21pm PST
Response:
Aaron:
I truly appreciate your offer to include KatrinaSurvivor.net's databases in a unified survivor database through a PFIF feed. At this time, however, KatrinaSurvivor.net will be unable to participate in any such project.
I wish you the best of luck with this effort.
Joe Bykowski
Validation: Passed with the following messages:
Schema validating with XSV 2.10-1 of 2005/04/22 13:10:49
- Target:
- Real name:
- Length: 2731730 bytes
- Last Modified: Thu, 08 Sep 2005 13:15:39 GMT
- Server: Apache/2.0.54 (Debian GNU/Linux) mod_jk2/2.0.4 PHP/4.3.11-0.dotdeb.1
- docElt: {http://zesty.ca/pfif/1.1}pfif
- Validation was strict, starting with type [Anonymous]
- schemaLocs: http://zesty.ca/pfif/1.1 -> http://zesty.ca/pfif/1.1/pfif-1.1.xsd
- The schema(s) used for schema-validation had no errors
- instanceAssessed: true
- No schema-validity problems were found in the target
Geeklibrarian 18:17, 8 Sep 2005 (EDT)
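Before submitting a file to XSV, a quick local check can catch the most basic problems (not well-formed, wrong root element). The sketch below only tests well-formedness and the document element reported as docElt in the XSV output above. It is not schema validation and does not replace XSV. The function name is made up.

```python
import xml.etree.ElementTree as ET

PFIF_NS = "http://zesty.ca/pfif/1.1"  # namespace from the XSV report above


def precheck(xml_text):
    """Return (ok, message) for a basic well-formedness and root check."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, "not well-formed: %s" % err
    expected = "{%s}pfif" % PFIF_NS
    if root.tag != expected:
        return False, "unexpected root element %s" % root.tag
    return True, "root element ok, %d child elements" % len(root)
```

A file that passes this can still fail strict schema validation, for example on field order or timestamp format.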
Hurricane Refugees
http://www.hurricanerefugee.com/names.asp
Records: 2,129
Status: YES!
Contact: content[at]hurricanerefugee.com / Greg VanDell egvandell [at] hotmail.com
Notes: Emailed for participation on Tues 9/6 4:32PM PST
Response: Please send any ASP code if available. If you only have PHP that's fine, I can just convert it. thx, -g
I can create an XML feed for you guys. I've been attempting to get in touch with the Red Cross to coordinate these efforts, but haven't had any success with them yet. I'm getting a large amount of hits (already received a quarter million), we could use this site as a portal. I'm running SQL Server and ASP...don't know what you guys are on.
Have you been able to link any of the other sites yet? Let me know if you guys want something more than an XML feed.
Thanks,
Greg Van Dell
Forest Hills, NY
Response2: (AaronPava 13:42, 7 Sep 2005 (EDT))
I have a realtime PFIF feed - for access please email me at -- content [at] hurricanerefugee.com
Hurricane Katrina Missing List
http://www.gwid.com/katrina.php
Records: 1,669
Status: Willing to help, Developer is on mailinglist.
Contact: kirk [at] gwid.com
Notes: emailed for status Tues 9/6 2:32 PST
Response: Yeah any help you can give will be helpful. I have been very busy trying to find a job in Montgomery AL. The database is MySQL with a PHP Front End.
Response2: Sure but give me a day I will have a new interface up that allows for more options. I am working with someone now on developing some search features along with photo postings. Thanks
Response3: I am forwarding the contact info for a guy that I am working with out of Cal. My link will be forwarded to his. His front end is better than mine and we will continue to upgrade it as necessary. He can help with setting up PFIF.
nik [at] monkeymaximus.com his name is nik
thanks, kirk
Tulane Safe Registry
http://www.scribedesigns.com/tulane/
Records: 1,502
Status: needs help implementing
Contact: Harley Robertson frontdoor2[at]scribedesigns.com IM:sparrowhawk12345
Notes: Emailed for participation on Tuesday 9/6 11:25pm PST :: AaronPava 02:26, 7 Sep 2005 (EDT)
Response: contacted me by IM at 11:40pm PST :: AaronPava
Response2: K, I'm all for providing a feed - just dun know how.
It shouldn't be hard, we just have a simple MySQL DB - I kinda cheated and just added a table to a database with other stuff to get it up faster though, so I would like to work on whatever I need to work on myself - I'd just like some sample scripts, or whatever is involved, and I'll customize and install them. My ICQ is : 20345371 Y!,AIM,MSN : sparrowhawk12345
-Harley Robertson
Notes: Scraped at Google (9/15), where they found 1,933 records. Scraped XML uploaded to WIKI. Contact: pasztor at gmail dot com
InfoZone New Orleans Missing
http://www.theinfozone.net/NOLAmissing2.html
Records: 2,962 (10/30 14:30 EST)
Status: Added to list on 9/7 10:30PM PST
Contact: katrina[at]theinfozone.net (has been contacted)
Harrison County Missing/Inquired About Persons
http://co.harrison.ms.us/assistance/missing/
Records: 1132
Status: Needs to be scraped
Contact: webmaster@co.harrison.ms.us
Note: Contains only list of people who have been asked about. There's a list of confirmed fatalities at http://co.harrison.ms.us/assistance/confirmed/
CNN Safe List
http://www.cnn.com/SPECIALS/2005/hurricanes/list/
Records: 1,120
Status: Scraped.
Contact: hurricanevictims[at]cnn.com
Notes: Sent email for participation request on Tues 9/6 4:04pm
Missing Katrina
http://callhome.textamerica.com/
Records: 669
Status: might be difficult; photos are good, but data seems limited
Contact: callhome.123 [at] tamw.com
Notes: Sent an email requesting participation Oasisbob 02:59, 7 Sep 2005 (EDT)
Hurricane Survivors.org
http://www.hurricanesurvivors.org/database.html
Records: 596
Status: Willing to help.
Contact: valenkim [at] hotmail.com
Notes: emailed for status Tues 9/6 2:37 PST
Katrina Survivor Database
http://katrina.streetlampsoftware.com/
Records: 456
Status: Being scraped by Gabe Wachob (gwachob@wachob.com)
Contact: katrina [at] streetlampsoftware.com
Notes: Emailed to see about PFIF participation on Tuesday 9/6 2:53 PST
NCMEC Hurricane Katrina Children
http://www.missingkids.com/missingkids/servlet/PageServlet?LanguageCountry=en_US&PageId=2077
Excel spreadsheet, direct link is http://www.missingkids.com/en_US/documents/katrina.xls
Records: 334
Status: Devin waiting for info on PFIF id from Ping
Contact: "Contact Us" link times out on their site. Oasisbob
emergency-database
http://www.emergency-database.com/guide/
Records: 200 (does it really even have this many? -- wayward)
Status: ?
Contact: INFO[at]EMERGENCY-DATABASE.COM
Notes: Emailed requesting participation on Wed 9/6/05 at 7PM. Seems less structured than it appears.
Survivor Registry
http://www.survivorregistry.com/cgi-bin/show_all.pl
Records: 193
Status: Sent an email inviting participation. Oasisbob 03:06, 7 Sep 2005 (EDT)
Contact: survivorregistry [at] gmail.com
Notes: "Message" field has name of found, names of missing; rather unstructured data
Search for Missing People
http://www.searchformissing.org/
Records: 80
Status: ?
Contact: leepaulmartin [at] gmail.com
Notes: How do you get data out of Google Maps AJAX?
Looks like there are 80 points in the map and 92 rows in the table (which contains only the names of the searcher and the missing person, so may not be useful). I think the person who entered 184 as the number of records was counting each name in the table as a record rather than each row. The data for the 80 Google map points is actually in the JavaScript in the HTML page and could be parsed out of there. It's just names and addresses, though. --KCIvey 09:05, 7 Sep 2005 (EDT)
I will work on this tonight. Please email me at mmondok@clariondata.com if someone beats me to it. I am looking at putting out an XML feed once it is done. For now, note that the AJAX will just spit out structured, client-side script in the source that can be parsed with regex. --Mmondok 12:06, 7 Sep 2005 (EDT)
It doesn't look like this site is actually using AJAX -- there is no XMLHttpRequest object or any derived object. Look at the JavaScript line that starts:
"html = "Lozano, Leonard
4646 Demontlizan
";var point =..."
It looks like each marker is created with a call to the createMarker function, passing lat, long, and the info. So you can just parse that line, or copy the JavaScript and modify the createMarker function to do something more useful, like a document.writeln in CSV format; then you can copy the output from your web browser. Let me know if you have any other questions about the Google Maps API or AJAX. -- ZBerke 10:23, 7 Sep 2005 (PST)
I worked on this last night. The map itself implements AJAX but the people are static, which makes things much easier. I am hoping to finish the scraping today. --Mmondok 08:23, 8 Sep 2005 (EDT)
The site is scraped and I have the people, but I am trying to match up the people searching for them. I have to tend to family matters this weekend, but I will make available what I have. --Mmondok 03:06, 9 Sep 2005 (EDT)
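As a rough illustration of the regex approach discussed in the notes above, the sketch below pulls the html = "..." string assignments out of a page's JavaScript and writes them as CSV. The exact layout of the page source is an assumption: it supposes each entry is a double-quoted string with name and address separated by <br> tags, and the sample data in the usage note is made up. Coordinates are not handled, since the createMarker call format is not shown here.

```python
import csv
import io
import re

# Matches: html = "....";  (non-greedy, so escaped quotes inside would break it)
HTML_ASSIGN = re.compile(r'html\s*=\s*"(.*?)"\s*;', re.DOTALL)


def extract_entries(page_source):
    """Return (name, address) tuples found in html = "..." assignments."""
    entries = []
    for body in HTML_ASSIGN.findall(page_source):
        parts = [p.strip() for p in re.split(r"<br\s*/?>", body, flags=re.IGNORECASE)]
        parts = [p for p in parts if p]
        if parts:
            # First segment is taken as the name, the rest as the address.
            entries.append((parts[0], " ".join(parts[1:])))
    return entries


def to_csv(entries):
    """Serialize entries as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["name", "address"])
    writer.writerows(entries)
    return out.getvalue()
```

For example, with a made-up line such as html = "Doe, Jane<br>123 Example St<br>";var point = 0; the first function yields one ("Doe, Jane", "123 Example St") entry. Matching the searcher names from the table to these entries would still need a separate step.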
Find Our Family
Records: ?
Status: ?
Contact: Yorweb.com Inc, 3256 Yates St, Bartlett TN 38134, Ph. Local 381 1715, 1-901-381-1715
Notes: Referenced in the Memphis Commercial Appeal as website set up by local company.
Operation Kare
Records: 23,000+
Status: Excel spreadsheet available from site; conversion to PFIF in progress.
Contact: Kay A. Doggett, 607 Belvue Road, Travelers Rest, SC 29690, work: 864-573-1643, cell 864-982-5396, < debugdiva @ gmail.com >
SafeKatrina.com
Records: 120?
Status: being scraped
Contact: joe at gmail dot com
Notes:
wecaretexas
Records: 216172
Status: Scraped at Google, validated, uploaded to WIKI.
Contact: pasztor at gmail dot com

