Waterloo Region Connected
Recovering old WW threads - Printable Version




Recovering old WW threads - samnabi - 08-28-2014

I've been trying to recover as much of WonderfulWaterloo.com as I can over the past few days, and I've set up a central repository on GitHub so we can organise all the scraped content and eventually import it to WRC.

Check it out, and let me know if you have any other ideas about how to recover that data. The more the merrier!

https://github.com/samnabi/wonderful-waterloo-recovery

Here's a section of the README:

Quote: How can I help?

1. Upload pages/images from your local cache. If you visited WonderfulWaterloo.com recently before it was taken offline, you may have local copies of those pages saved in your browser's cache. Use one of the methods below to save anything you can related to wonderfulwaterloo.com. Even if it seems irrelevant, don't delete it.

Once you've located your local cache of WonderfulWaterloo pages/assets, please submit them as a pull request so I can add them here. Or, if you're not familiar with git, get a hold of me at sam@samnabi.com.

2. Use Warrick to scrape various web archives. Warrick is a command-line Perl utility for recovering websites from your local cache, archive.org, Google cache, Bing cache, and other sources. I have already used this tool with some success, but multiple attempts from different machines may deliver more results.

Please submit a pull request for any data you recover with Warrick. There will be a lot of duplicates, but we can cross that bridge when we get there. No need to filter stuff out on your end.
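
For anyone curious how the duplicates might be sorted out later, here is a rough Python sketch that groups recovered files by a hash of their contents. It is only an illustration, not how the repository actually handles duplicates: the function name `find_duplicates` and the idea of pointing it at a folder of scraped files are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` by the SHA-256 hash of their contents.

    Returns a dict mapping each hash to the list of paths that share it,
    keeping only hashes that occur more than once.
    `root` is a hypothetical folder of recovered pages/images.
    """
    by_hash = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_hash.setdefault(digest, []).append(path)
    # Only byte-for-byte identical files end up in the same group.
    return {d: paths for d, paths in by_hash.items() if len(paths) > 1}
```

This only catches exact copies. Two scrapes of the same page with different timestamps or ad markup would hash differently, so they would still need a manual or fuzzier comparison.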



Re: Recovering old WW threads - rangersfan - 09-24-2014

Has there been much success in trying to recover any of the old data from WW?


RE: Recovering old WW threads - samnabi - 10-22-2014

Not a whole lot. Still working behind the scenes to get more data, but I think I've got all the low-hanging fruit already. I'm hoping to get it archived properly with some help from UW.


RE: Recovering old WW threads - samnabi - 01-11-2015

I've updated the GitHub page with more ways that people can help recover data from WW. Please do check it out if you can lend a hand.

So far, 5955 individual posts across 126 threads have been recovered from the forums, along with 471 images.

The posts can be accessed as a big JSON file here: https://github.com/samnabi/wonderful-waterloo-recovery/blob/master/json/threads_1421023453.json
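
If you want to poke at the JSON yourself, something like the Python sketch below could tally threads and posts. The file's real layout isn't described in this thread, so the structure here (a list of threads, each with a `posts` list) and the field names `title`, `posts`, `author`, and `body` are assumptions for illustration; adjust them to match the actual file.

```python
import json


def summarize(threads):
    """Return (thread_count, post_count) for a list of thread dicts.

    Assumes each thread is a dict with a "posts" list; this layout is
    hypothetical and may not match the real JSON file.
    """
    post_count = sum(len(t.get("posts", [])) for t in threads)
    return len(threads), post_count


# Tiny made-up sample standing in for the real file's contents.
sample = json.loads(
    '[{"title": "Thread A", "posts": ['
    '{"author": "user1", "body": "first"}, '
    '{"author": "user2", "body": "second"}]}, '
    '{"title": "Thread B", "posts": ['
    '{"author": "user3", "body": "third"}]}]'
)

print(summarize(sample))  # prints (2, 3)
```

To run it against a downloaded copy, replace the sample with `json.load(open(path))`, where `path` points at your local copy of the file.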

The images can be accessed here: https://github.com/samnabi/wonderful-waterloo-recovery/tree/master/raw_data/subset_images

Still looking for support from the universities to archive these properly.


RE: Recovering old WW threads - Spokes - 01-13-2015

Thanks so much for all of your hard work in recovering these. While we might not ever have them integrated on WRC, at least they're not totally lost.


RE: Recovering old WW threads - samnabi - 10-04-2018

Sorry to revive an old thread, but there's a browseable version of those old posts here: https://wonderfulwaterloo.samnabi.com/


RE: Recovering old WW threads - Spokes - 10-04-2018

Thanks for putting this together.

From a personal perspective, definitely mixed feelings looking back at all that. From being there from the very start to my exodus (sadly I found the thread that eventually led to it) to the downfall of it all. Bittersweet for sure.

Sad that such a valuable resource was lost, but thrilled that what's turned out to be a more successful version in WRConnected was born out of it.