| Press F1 | ||||
| Thread ID: 88586 | 2008-04-01 17:49:00 | Best software to scrap the online content from website | prayami (13094) | Press F1 |
| Post ID | Timestamp | Content | User | ||
| 655023 | 2008-04-01 17:49:00 | Hi, I want to scrape (or scan, or grab; I'm not sure of the proper word) one website, put the contents in an Excel file, and save it on my system. If possible, I want to set different criteria for the scraping, such as "scrape the HTML table with this heading". Is there any free software available for that? Please also let me know which software is best to buy in case the free ones don't have good features. Thanks in advance, |
prayami (13094) | ||
| 655024 | 2008-04-01 18:00:00 | HTTrack is my first choice, or BlackWidow or Teleport Pro. None will grab "everything"; it depends on the site and whether it uses certain scripts or database-type things. |
Bantu (52) | ||
| 655025 | 2008-04-01 18:13:00 | Thanks for the reply. I read the features of HTTrack and BlackWidow, but I couldn't find a feature to save to an Excel file. All they do is copy the whole website and put it offline on our system. But I want to save some of the data in a format like Excel. Thanks, |
prayami (13094) | ||
| 655026 | 2008-04-01 20:09:00 | I doubt you'll find any that export to Excel; possibly a few that do Word. Why do you need it in Excel? Most likely you'll need to save the page offline, as with HTTrack etc., and then manually copy the bits you want into Excel. |
autechre (266) | ||
| 655027 | 2008-04-01 20:41:00 | I know there is some software available, but before I buy, I want some expert advice. I searched the net and found: Web Scraper Plus+ 5 (www.download.com). Thanks, |
prayami (13094) | ||
| 655028 | 2008-04-01 20:46:00 | What exactly do you want to insert into Excel from a web page? Script code, comments, links, tags, photos, or what? As autechre mentioned above, Excel would be an uncommon place to store web components (if that's what you are considering), even for webmasters. Importantly, what do you plan to do with this content? I certainly would not "grab" topical or photo content without contacting the webmaster/owner of the site if you intend to post it on the internet or publish it publicly by other means. I contacted a site owner to use his NZ photos on one of my sites, and he agreed as long as I left a link to his site. You could also consider a compiler that converts HTML etc. to PDF, more commonly known as an e-book, which web owners often create as an important tutorial reference. |
kahawai chaser (3545) | ||
| 655029 | 2008-04-01 21:46:00 | I am not going to take anything without the owners' consent, and I am not going to take photos. Mostly I need to get HTML table data into Excel, e.g. address detail lists, movie song detail lists, and product detail lists and specifications. I may need to get anything that is in table form on the website. Thanks, |
prayami (13094) | ||
| 655030 | 2008-04-01 23:26:00 | I see, prayami - tables. I guess many site visitors would copy them manually (right click, etc.). There are probably scripts on the web somewhere, though I believe there is no popular software to do so. You could visit SourceForge (sourceforge.net/), the open source site, and maybe try a search on the Digital Point (http:) forums, where hundreds of programmers and web designers hang out. The term scraping generally refers to programs/people that automatically collect RSS feeds from sites that provide RSS subscriptions. They then automatically drop those feeds (e.g. latest news, content, updates) into their own sites... |
kahawai chaser (3545) | ||
| 655031 | 2008-04-01 23:45:00 | You can do it from within Excel: Data > Import External Data > New Web Query. Browse to the page you want, then in the query browser window tick the tables or portions of the page that you want saved in Excel and hit OK (or Done, or whatever the button is). There are other options that you can set within that window too. Once you've got it in Excel you can run formulas etc., and you can set it to update on a regular basis if desired (to keep the data up to date). Mike. |
Mike (15) | ||
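For anyone who would rather script this than use the Excel web query, here is a minimal Python sketch of the same idea: pull the HTML tables out of a page, pick the one with a given column heading, and write it as CSV, which Excel opens directly. It uses only the standard library. The names `TableExtractor`, `tables_from_html`, `find_table`, `to_csv` and the `SAMPLE` page are made up for illustration and do not come from any tool mentioned in the thread. It assumes reasonably well-formed HTML with no nested tables, so treat it as a starting point, not a finished scraper.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> as a list of rows, each row a list of cell text.

    Assumption: tables are not nested and cells/rows have closing tags.
    """

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables
        self._rows = None  # rows of the table currently being read
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._rows = []
        elif tag == "tr" and self._rows is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._rows.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._rows is not None:
            self.tables.append(self._rows)
            self._rows = None


def tables_from_html(html):
    """Return all tables found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


def find_table(tables, heading):
    """Return the first table whose first row has a cell equal to `heading`
    (case-insensitive), or None if there is no such table."""
    wanted = heading.strip().lower()
    for rows in tables:
        if rows and any(cell.lower() == wanted for cell in rows[0]):
            return rows
    return None


def to_csv(rows):
    """Render rows as CSV text that Excel can open."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Made-up sample page standing in for a real site.
SAMPLE = """
<html><body>
<table><tr><th>Song</th><th>Movie</th></tr>
<tr><td>Title A</td><td>Film X</td></tr></table>
<table><tr><th>Product</th><th>Price</th></tr>
<tr><td>Widget</td><td>9.50</td></tr>
<tr><td>Gadget, large</td><td>20.00</td></tr></table>
</body></html>
"""

if __name__ == "__main__":
    rows = find_table(tables_from_html(SAMPLE), "Product")
    print(to_csv(rows), end="")
```

To use it on a real page, you would fetch the HTML first (for example with `urllib.request.urlopen`, with the site owner's consent as discussed above), pass the decoded text to `tables_from_html`, and save the `to_csv` output to a `.csv` file. Pages that build their tables with scripts will not work with this approach, for the same reason Bantu gave for the offline copiers.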