Recovering old WW threads
#1
I've been trying to recover as much of WonderfulWaterloo.com as I can over the past few days, and I've set up a central repository on Github so we can organise all the scraped content and eventually import it to WRC.

Check it out, and let me know if you have any other ideas about how to recover that data. The more the merrier!

https://github.com/samnabi/wonderful-waterloo-recovery

Here's a section of the README:

Quote: How can I help?

1. Upload pages/images from your local cache. If you visited WonderfulWaterloo.com recently before it was taken offline, you may have local copies of those pages saved in your browser's cache. Use one of the methods below to save anything you can related to wonderfulwaterloo.com. Even if it seems irrelevant, don't delete it.

Once you've located your local cache of WonderfulWaterloo pages/assets, please submit them as a pull request so I can add them here. Or, if you're not familiar with git, get a hold of me at sam@samnabi.com.

2. Use Warrick to scrape various web archives. Warrick is a command-line Perl utility for recovering websites from your local cache, archive.org, Google cache, Bing cache, et al. I have already used this tool with some success, but multiple attempts by different machines may deliver more results.

Please submit a pull request for any data you recover with Warrick. There will be a lot of duplicates, but we can cross that bridge when we get there. No need to filter stuff out on your end.
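
For when we do get to that bridge, here is a minimal sketch of one way the duplicates could be spotted after everything has been collected. It is not part of the repository or of Warrick: the function name `find_duplicates` and the directory layout are assumptions for illustration, and it only catches byte-for-byte identical files, not pages that differ slightly between archives.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicates(root):
    """Group files under `root` by the SHA-256 hash of their contents.

    Hypothetical helper: returns a dict mapping each hash to the list of
    paths that share it, keeping only hashes seen more than once.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Only hashes shared by two or more files count as duplicates.
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Pointing it at a folder of recovered pages, for example `find_duplicates("recovered")` (folder name assumed), gives the groups of identical files, so one copy of each can be kept and the rest reviewed.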
#2
Has there been much success in trying to recover any of the old data from WW?
#3
Not a whole lot. Still working behind the scenes to get more data, but I think I've got all the low-hanging fruit already. I'm hoping to get it archived properly with some help from UW.
#4
I've updated the GitHub page with more ways that people can help recover data from WW. Please do check it out if you can lend a hand.

So far, 5955 individual posts across 126 threads have been recovered from the forums, along with 471 images.

The posts can be accessed as a big JSON file here: https://github.com/samnabi/wonderful-wat...23453.json

The images can be accessed here: https://github.com/samnabi/wonderful-wat...set_images

Still looking for support from the universities to archive these properly.
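
For anyone who wants to poke at the posts file, here is a rough sketch of how the post and thread counts could be checked in Python. The structure of the JSON is an assumption: this supposes the file is a list of post objects, each with a `thread` field, which may not match the real file, so adjust `thread_key` to whatever the data actually uses. The function names are made up for illustration.

```python
import json
from collections import Counter


def load_posts(path):
    """Read the posts file; assumes the top level is a JSON list of objects."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def summarise_posts(posts, thread_key="thread"):
    """Return (post count, thread count, five busiest threads).

    `thread_key` is a hypothetical field name; posts without it are
    grouped under "(unknown)".
    """
    counts = Counter(post.get(thread_key, "(unknown)") for post in posts)
    return len(posts), len(counts), counts.most_common(5)
```

If the assumed structure holds, `summarise_posts(load_posts("posts.json"))` (file name assumed) should report totals that can be compared against the figures above.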
#5
Thanks so much for all of your hard work in recovering these. While we might not ever have them integrated on WRC, at least they're not totally lost.
#6
Sorry to revive an old thread, but there's a browseable version of those old posts here: https://wonderfulwaterloo.samnabi.com/
#7
Thanks for putting this together.

From a personal perspective, definitely mixed feelings looking back at all that. From being there from the very start to my exodus (sadly I found the thread that eventually led to it) to the downfall of it all. Bittersweet for sure.

Sad that such a valuable resource was lost, but thrilled that what's turned out to be a more successful version in WRConnected was born out of it.
About Waterloo Region Connected

Launched in August 2014, Waterloo Region Connected is an online community that brings together all the things that make Waterloo Region great. Waterloo Region Connected provides user-driven content fueled by a lively discussion forum covering topics like urban development, transportation projects, heritage issues, businesses and other issues of interest to those in Kitchener, Waterloo, Cambridge and the four Townships - North Dumfries, Wellesley, Wilmot, and Woolwich.
