Thread ID: 113606 2010-10-28 00:59:00 Help extracting a tarball ubergeek85 (131) Press F1
Post ID Timestamp Content User
1148274 2010-10-28 09:55:00 I guess I'm not quite expressing myself correctly; what I'm trying to do is make a simple HTML dump of wikipedia, with no executables required. Sure, it might be a few GB, but TBH, IDK. It's a project, not saying I'm going to succeed, not saying it's useful, or even the best way to go about it. Just a bit of fun.

Aaah - I thought you wanted to build a mobile app. I suspect you'll still be disappointed though; there's no easy way to get what you've got down to a usable size and still keep it browsable.


Also, Filzip has finished, but it's only extracted as far as 1/3/6... not even into the a's.

Interesting - sounds like it bailed for some reason.

Do you have a Linux machine handy (or a livecd, bootable USB drive etc)? There's nothing quite like tar for extracting tarballs...

tar -xf myfile.tar
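
If booting Linux is a hassle, roughly the same thing can be done from Python's standard tarfile module, which runs on Windows too. This is only a minimal sketch: extract_tar and its arguments are made-up names for illustration, and it hasn't been tried on an archive anywhere near this size.

```python
import tarfile


def extract_tar(archive_path, dest_dir):
    """Extract every member of a tar archive, reading it front to back."""
    # Newer Python versions offer a "data" filter that rejects unsafe paths;
    # older ones do not accept the keyword, so only pass it when available.
    extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    count = 0
    # "r|*" opens the archive as a forward-only stream (no seeking back)
    # and auto-detects gzip/bzip2/xz compression if present.
    with tarfile.open(archive_path, mode="r|*") as tar:
        for member in tar:
            tar.extract(member, path=dest_dir, **extra)
            count += 1
    return count
```

Calling extract_tar("myfile.tar", "out") would unpack into an out folder and return the number of members extracted.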
Erayd (23)
1148275 2010-10-28 10:09:00 Interesting - sounds like it bailed for some reason.

I think it might have read the header wrong (or just ran out of space while trying to) because when it finished, the 'total progress' bar was right near the end, and it'd been moving as you would expect it to (slowly, oh so slowly).

I think I've got some Linux VMs lying around somewhere, although I might not need them - I tried opening the tar in 7zip from within the GUI (instead of from Explorer's context menu), and this time it's showing progress while it tries to read the file TOC, so it's a start. Extracting might be another matter though...

Ya'd think with GNU being GNU, and the GPL being the GPL, that there would be a proper Windows port of tar (the binary). I found something similar (GnuWin32 tar), but it's not quite 'good' (it refused to even take a whiff of the file). My guess is that since it hasn't been updated since 2003, it still has the old 8GB limit of the tar format, and the 4GB limit of the old fopen library call.
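
For what it's worth, here's where those two numbers would come from, as a quick bit of arithmetic. It assumes the classic ustar header (a 12-byte size field holding 11 octal digits plus a terminator) and a 32-bit unsigned file offset; whether either is actually what tripped up that particular build is only a guess. The constant names are just for illustration.

```python
# Classic ustar header: the size field is 11 octal digits plus a terminator.
USTAR_SIZE_DIGITS = 11
USTAR_MAX_SIZE = 8 ** USTAR_SIZE_DIGITS - 1  # largest size one member can record

# A 32-bit unsigned file offset can address this many bytes.
OFFSET_32BIT_LIMIT = 2 ** 32

GIB = 1024 ** 3
print(USTAR_MAX_SIZE + 1 == 8 * GIB)   # True: just under 8 GiB per member
print(OFFSET_32BIT_LIMIT == 4 * GIB)   # True: 4 GiB
```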
ubergeek85 (131)
1148276 2010-10-28 10:41:00 Tar files don't have a TOC / header (at least not in the way you mean) - it was designed as a tape format. Files are concatenated as they're added to the archive, and there isn't a single spot in the file that contains all the metadata.
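
To make that concrete, here's a rough sketch of how a lister has to work: read a 512-byte header, note the name and size, skip over that member's data (padded up to a 512-byte boundary), and repeat until the end of the archive. The function names are made up, and it assumes plain ustar-style headers with short names and octal sizes; real archives can also contain GNU long-name or pax extension records and base-256 sizes for huge members, none of which this handles.

```python
import io
import tarfile

BLOCK = 512  # tar archives are a sequence of 512-byte blocks


def list_members(stream):
    """Yield (name, size) by walking the archive one header at a time."""
    while True:
        header = stream.read(BLOCK)
        # A short read or an all-zero block marks the end of the archive.
        if len(header) < BLOCK or header == b"\0" * BLOCK:
            return
        name = header[0:100].split(b"\0", 1)[0].decode("utf-8")
        size_field = header[124:136].split(b"\0", 1)[0].strip()
        size = int(size_field.decode("ascii") or "0", 8)  # octal text
        yield name, size
        # Skip this member's data, rounded up to a whole number of blocks.
        padded = (size + BLOCK - 1) // BLOCK * BLOCK
        stream.seek(padded, io.SEEK_CUR)


def build_demo_archive():
    """Build a tiny two-file archive in memory to try the walker on."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, payload in [("a.html", b"<p>a</p>"), ("b.html", b"x" * 600)]:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


for name, size in list_members(build_demo_archive()):
    print(name, size)
```

The only way to reach the second header is to get past the first member, which is why listing a huge archive means walking the whole file.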

There are various Windows ports of tar floating around, if you want such a thing...
Erayd (23)
1148277 2010-10-28 11:05:00 Oh dear; this happened after a few low memory warnings, and after I increased my page file size by 4GB... oh dear... ubergeek85 (131)
1148278 2010-10-28 11:08:00 Sounds like 7zip is not the right tool for the job then - extracting a tarball shouldn't need much ram at all. Erayd (23)
1148279 2010-10-28 11:11:00 Oh, that wasn't even extracting it; that was just opening it to view the list of files!

Right... hmmm... I'll fire up a Linux VM tomorrow, it's getting a bit late for me now.
ubergeek85 (131)
1148280 2010-10-28 11:40:00 Oh, that wasn't even extracting it; that was just opening it to view the list of files!

That still takes about the same amount of time - it still has to scan the entire 218GB tar file in order to list the contents.
Erayd (23)
1148281 2010-10-28 18:19:00 Now I've tried to follow this thread with interest, but after the last post I'm still none the wiser. Wtf is a tarball. Only tarball I know sits on my car, and if I'm not careful on me.

sarel :lol::lol:
sarel (2490)
1148282 2010-10-28 22:15:00 Wtf is a tarball.

A tarball is one of the more common names for a tar archive. See here (en.wikipedia.org(file_format)).

You can think of it as pretty much the Linux / Unix equivalent of a zip file.

Note that tar files aren't always compressed, but they usually are. The most common compression methods are gzip, bzip2 and lzma. 7zip is an unusual choice.
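
If you're ever unsure what you've actually been handed, the first few bytes of the file usually give it away. A small sketch, assuming the well-known magic numbers for gzip, bzip2, xz and 7z plus the "ustar" marker at offset 257 of a tar header; sniff_format is a made-up helper, and the older standalone .lzma format has no reliable signature, so it would come back as unknown here.

```python
def sniff_format(head):
    """Guess the container/compression from the first bytes of a file."""
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    if head.startswith(b"BZh"):
        return "bzip2"
    if head.startswith(b"\xfd7zXZ\x00"):
        return "xz"
    if head.startswith(b"7z\xbc\xaf\x27\x1c"):
        return "7z"
    # Uncompressed tar: POSIX and GNU headers both carry "ustar" here.
    if head[257:262] == b"ustar":
        return "tar (uncompressed)"
    return "unknown"
```

Reading the first 512 bytes of the file and passing them in is enough for all of these checks.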
Erayd (23)
1148283 2010-10-29 00:43:00 Excellent. Seeing that I'm starting to go into Linux as well I thank you for the info, now I also know.

sarel:lol:
sarel (2490)