• Ultraviolet Photography

Archiving Websites

11 replies to this topic

#1 Andrea B.

    Desert Dancer

  • Owner-Administrator
  • 7,184 posts
  • Location: USA

Posted 30 October 2018 - 06:47

I tried to use my iPad to add an update to the original Archiving Websites topic that was in this section and managed instead to delete it. Touch screens, foo! I don't even know how I messed up. It just happened! I hope I did not disappear the topic right in the middle of someone reading it.

Here's the update: I played around this afternoon with the SiteSucker app. I archived about 4GB of our forum onto my laptop before the program ran into some snag. I am happy to report that SiteSucker cannot access protected areas of a forum. I really didn't think it (or other apps like it) could do that, but now I know for sure that they cannot.

Somewhere I read a warning that you have to be careful how you set link following in these archiving apps so that you prevent sucking up the entire Internet. <lolling>
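A minimal sketch of the kind of link-following rule that warning is about, assuming an archiver that asks a filter whether each discovered link should be followed. The function name `should_follow` and the `example.org` URLs are hypothetical and are not taken from SiteSucker or wget; it only illustrates keeping a crawl on one host and under one starting path.

```python
from urllib.parse import urljoin, urlparse


def should_follow(link, page_url, root_url):
    """Return True only if `link` stays on root_url's host and under its path.

    `link` may be relative; it is resolved against the page it was found on.
    """
    absolute = urljoin(page_url, link)
    target = urlparse(absolute)
    root = urlparse(root_url)

    # Skip mailto:, javascript:, and other non-web schemes.
    if target.scheme not in ("http", "https"):
        return False
    # Never leave the starting host, so the crawl cannot wander off-site.
    if target.netloc.lower() != root.netloc.lower():
        return False
    # Stay under the starting path (for example, the forum's own directory).
    return target.path.startswith(root.path)
```

With a rule like this, links to outside sites are simply skipped, which matches the behaviour described in the next paragraph: pages inside the site are saved and off-site links do not work offline.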

With SiteSucker's "local html" archiving, I could access UVP entirely offline on my Macbook and that was pretty cool. There were some links to stuff outside UVP that don't work, of course. But the pages and topics and photos were all there and links within UVP all worked - at least the ones I tested worked. Couldn't try them all of course.

I want to try using SiteSucker or wget to archive by section so I can save the Botanicals separately from the Technical areas. This is because it is very time-consuming to archive an entire website in one session. Additionally, I was plagued with timeouts, which I think happened because SiteSucker was hitting UVP so intensely that some kind of throttling was triggered. So breaking up the website into 3 or 4 separate archives might be the way to go.
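A rough sketch of the per-section idea with a pause between requests, to reduce the chance of triggering throttling. Everything named here is an assumption for illustration: the section paths in `SECTIONS` are made up (the forum's real URL layout is not given in this topic), and `fetch` stands in for whatever actually downloads a page.

```python
import time
from urllib.parse import urlparse

# Hypothetical section paths; replace with the site's real layout.
SECTIONS = {
    "botanicals": "/forum/botanicals/",
    "technical": "/forum/technical/",
}


def section_for(url, sections=SECTIONS):
    """Return the name of the section whose path prefix matches url, or None."""
    path = urlparse(url).path
    for name, prefix in sections.items():
        if path.startswith(prefix):
            return name
    return None


def archive_section(urls, name, fetch, delay_seconds=2.0,
                    sleep=time.sleep, sections=SECTIONS):
    """Fetch only the URLs in one section, pausing between requests.

    `fetch` is a caller-supplied function (url -> page content).
    Returns a dict mapping each archived URL to its content.
    """
    saved = {}
    for url in urls:
        if section_for(url, sections) != name:
            continue
        if saved:
            # Pause before every request after the first one.
            sleep(delay_seconds)
        saved[url] = fetch(url)
    return saved
```

Running `archive_section` once per section name gives the separate archives, and each run can be restarted on its own if it hits a snag.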
Andrea G. Blum
Often found hanging out with flowers & bees.

#2 Cadmium

    Member

  • Members(+)
  • 2,602 posts

Posted 30 October 2018 - 08:38

Why are you archiving it?
Are you preparing for 'the final cut'? (A reference to Bjorn's old final departure post on Nikon Cafe... many moons ago.)
Is this what people do when they move to New Mexico?
I'm scared... It reminds me of the final scene from Terminator...
https://youtu.be/5C6GZQ7UNaU

Edited by Cadmium, 30 October 2018 - 09:05.


#3 bobfriedman

    Member

  • Members
  • 324 posts
  • Location: Massachusetts

Posted 30 October 2018 - 10:42

this site may do it for you... https://web.archive.org

#4 nfoto

    Former Fierce Bear of the North

  • Owner-Administrator
  • 2,310 posts
  • Location: Sørumsand, Norway

Posted 30 October 2018 - 11:19

They do snapshots. Andrea probably wants a complete searchable backup of UVP?

#5 Andrea B.

    Desert Dancer

  • Owner-Administrator
  • 7,184 posts
  • Location: USA

Posted 30 October 2018 - 15:29

Mostly it is that I want to know how to do this. It's not that I'm actually archiving for any particular reason at the moment.

Eventually I would like to preserve the botanical info in some way. And I've had other people ask me about preserving the technical information. I liked the idea of preserving any UVP info in the form of html pages which could reside as a sort of reference on a laptop without having to be connected to the internet.

Someday I will need to retire from running websites, yes? It might be nice to save all this work. When I'm 105 I'll sit in my rocking chair and look at the archived floral UV-signatures and remember all the wildflower safaris I went on. :rolleyes: B) :lol:

#6 bobfriedman

    Member

  • Members
  • 324 posts
  • Location: Massachusetts

Posted 30 October 2018 - 15:47

nfoto, on 30 October 2018 - 11:19, said:

They do snapshots. Andrea probably wants a complete searchable backup of UVP?

Understood... but I have been able to recover product PDFs that are long gone using this archive. It seems the snapshot contains working links into the archive. (I was looking for older Zeiss ZF.2 tech data sheets to compare with the Milvus and was able to retrieve them.)

#7 Andrea B.

    Desert Dancer

  • Owner-Administrator
  • 7,184 posts
  • Location: USA

Posted 30 October 2018 - 16:58

Not all archiving tools "follow" links, which is typically how PDFs are presented.

#8 dabateman

    Da Bateman

  • Members(+)
  • 885 posts
  • Location: Maryland

Posted 31 October 2018 - 02:28

Seems like a good idea.
Reminds me of this:
https://xkcd.com/1909/

Also reminds me of the printed Wikipedia version that was released and relevant for a day.

I didn't think you were allowed to retire, Andrea. Weren't there plans to keep your head in a jar so you could work on the site forever?

#9 Andrea B.

    Desert Dancer

  • Owner-Administrator
  • 7,184 posts
  • Location: USA

Posted 31 October 2018 - 02:43

So I’ve heard !!! <laughing>

That xkcd frame was spot on. Too funny. Too true!

#10 Pedro J. Aphalo

    Pedro J. Aphalo

  • Members
  • 76 posts
  • Location: Helsinki, Finland

Posted 07 December 2018 - 11:46

bobfriedman, on 30 October 2018 - 10:42, said:

this site may do it for you... https://web.archive.org

Yes, it does. In fact, several snapshots of UVP are available, the oldest from 2004. Just search for the site URL. There is a long gap. Maybe you had settings to discourage crawling by robots.

#11 bobfriedman

    Member

  • Members
  • 324 posts
  • Location: Massachusetts

Posted 07 December 2018 - 12:52

there is a more general site.. https://archive.org that has a library as well.

#12 Andrea B.

    Desert Dancer

  • Owner-Administrator
  • 7,184 posts
  • Location: USA

Posted 02 January 2019 - 18:05

No, we don't have any suppression of crawlers at this time. Although I am often tempted to do that when I see Bingbot (or others) lingering here for days.
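For anyone curious what suppressing a crawler would look like, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text below is a made-up example, not this site's actual file, and `is_allowed` is just a helper name chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks Bingbot everywhere, allows everyone else.
EXAMPLE_ROBOTS_TXT = """\
User-agent: bingbot
Disallow: /

User-agent: *
Disallow:
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Rules like these are only a request; well-behaved crawlers honor them, but nothing forces a crawler to do so.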

The Wayback Machine does not store all pages of a site.