I'm continuing the discussion on how to download a Theremin World thread from here:
lets-design-and-build-a-mostly-digital-theremin Page=214
I've done it with a Java application. Source code is here: ThereminWorldArchiver.zip
It downloads all selected pages and their resources and stores them in a local cache, so pages that are already stored are not downloaded again on the next run.
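To illustrate the caching idea, here is a minimal sketch. It is not the code from the zip: the names `PageCache`, `Loader` and `fileNameFor` are made up for this example, and the way the cache file name is derived from the URL is an assumption.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of a download cache: each URL is fetched once and then read from disk. */
final class PageCache {

    /** Hypothetical loader; a real one would perform the HTTP GET. */
    interface Loader {
        byte[] load(String url) throws IOException;
    }

    private final Path cacheDir;
    private final Loader loader;

    PageCache(Path cacheDir, Loader loader) throws IOException {
        this.cacheDir = Files.createDirectories(cacheDir);
        this.loader = loader;
    }

    /** Returns the cached bytes if present, otherwise loads and stores them. */
    byte[] get(String url) throws IOException {
        Path file = cacheDir.resolve(fileNameFor(url));
        if (Files.exists(file)) {
            return Files.readAllBytes(file);   // cache hit: no download
        }
        byte[] data = loader.load(url);        // cache miss: download once
        Files.write(file, data);
        return data;
    }

    /** Assumed naming scheme: replace every character that is unsafe in file names. */
    static String fileNameFor(String url) {
        return url.replaceAll("[^A-Za-z0-9._-]", "_");
    }
}
```

With this naming scheme two different URLs could in theory map to the same file name, and very long URLs could exceed file name limits. Using a hash of the URL as the file name would avoid both.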
To convert HTML to PDF I simply used Chrome.
I've also added these features:
- Title, Thread number, start and stop pages are stored in a config file.
- It is possible to exclude single posts or all posts from a specified user (troll/spam).
- Instead of the box on the left side with detailed user information, I've used a single horizontal line followed by the user name, date and post number.
- Embedded YouTube videos have been replaced by a preview image with a link.
- If the first line of a post is strong (bold), it is used as an h1 heading. With CSS it would then be possible to add a page break before each h1 heading.
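As an illustration of the YouTube replacement, here is a rough sketch using a regular expression. It is not the code from the zip. It assumes the embeds look like the standard `<iframe src="https://www.youtube.com/embed/VIDEO_ID">` markup, and it assumes the commonly used `img.youtube.com/vi/VIDEO_ID/hqdefault.jpg` thumbnail address. The class name `YouTubeEmbedReplacer` is made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: swaps YouTube iframe embeds for a linked preview image. */
final class YouTubeEmbedReplacer {

    // Assumed embed markup: <iframe ... src="https://www.youtube.com/embed/ID..." ...></iframe>
    private static final Pattern EMBED = Pattern.compile(
            "<iframe[^>]*\\bsrc=\"(?:https?:)?//www\\.youtube\\.com/embed/([A-Za-z0-9_-]+)[^\"]*\"[^>]*>\\s*</iframe>",
            Pattern.CASE_INSENSITIVE);

    static String replaceEmbeds(String html) {
        Matcher m = EMBED.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String id = m.group(1); // the video id taken from the embed URL
            String link = "<a href=\"https://www.youtube.com/watch?v=" + id + "\">"
                    + "<img src=\"https://img.youtube.com/vi/" + id + "/hqdefault.jpg\""
                    + " alt=\"YouTube video " + id + "\"></a>";
            // quoteReplacement keeps '$' and '\' in the link from being treated as group references
            m.appendReplacement(out, Matcher.quoteReplacement(link));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A regular expression is good enough for markup that always has the same shape. For anything less regular, an HTML parser would be the safer choice.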
Example of config script:
title: D-Lev_Theremin thread:28554 from:1 to:214 -host:http://illegal.images.com -user:Troll_1 -user:Troll_2 -post:42 -post:43 -post:44 -post:142 -post:143
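For readers who want to see how such a config could be read, here is a small sketch of a parser. It is a guess at the format based only on the example above: whitespace-separated `key:value` entries, where a leading `-` marks an exclusion that may occur several times. The class `ArchiveConfig` and its fields are made up for this example and are not taken from the zip.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a parser for the config format shown above (format details are assumed). */
final class ArchiveConfig {

    /** Single-valued entries such as title, thread, from, to. */
    final Map<String, String> settings = new LinkedHashMap<>();
    /** Repeatable exclusions such as -host, -user, -post, keyed without the leading '-'. */
    final Map<String, List<String>> excludes = new LinkedHashMap<>();

    static ArchiveConfig parse(String text) {
        ArchiveConfig cfg = new ArchiveConfig();
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            String token = tokens[i];
            int colon = token.indexOf(':');   // first colon only, so URLs stay intact
            if (colon < 0) {
                continue;                     // not a key:value entry, ignored here
            }
            String key = token.substring(0, colon);
            String value = token.substring(colon + 1);
            if (value.isEmpty() && i + 1 < tokens.length) {
                value = tokens[++i];          // handles "title: D-Lev_Theremin"
            }
            if (key.startsWith("-")) {
                cfg.excludes.computeIfAbsent(key.substring(1), k -> new ArrayList<>()).add(value);
            } else {
                cfg.settings.put(key, value);
            }
        }
        return cfg;
    }
}
```

Calling `ArchiveConfig.parse(...)` on the example line gives `settings` with `title`, `thread`, `from` and `to`, and `excludes` with one host, two users and five posts.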