HTML to Word
If there are no images, both Netscape and IE work fine, so the rest of this paper assumes that the HTML files have embedded graphic images. The goal is to create one Microsoft Word document that contains all the images, without relying on OLE (Object Linking and Embedding) links. With linked images, you must keep the main Word file and all the images as separate files. If you move the graphic files, or if the main document is emailed to someone without them, Word will display only an error image with a square, triangle, and circle.
DO NOT USE NETSCAPE - Netscape does not save the images at all. I have not tried the newest version (6.0, I believe), which may save them.
Use IE (Internet Explorer) and Word
IE allows you to save the entire web page, including the images. When Word opens and converts the HTML file, it initially treats the images as external links to the location on your drive where the files reside, so if you move or email the Word file, you lose the pictures. Within Word, however, you can edit the links so that all the pictures are saved in the Word document instead.
NOTE: if you want to try this, I used the following web page as a test: http://www.dcbnet.com/notes/9611t1.html
Method 1 - Save the Web Page and Open in Word
1) open the page in IE and let it complete loading
2) File/Save As . . .
Save as Type: Web Page Complete
enter a name for the file, such as temp1.htm
Click Save, and select a temporary folder, such as c:\temp
NOTE: the main html file will be saved in c:\temp, and IE will
create a new folder under c:\temp and place the images there
3) open Word, and open the file, temp1.htm in Word
NOTE: you must have the HTML converter installed in your Word setup (most installs do have this feature). Word will convert the html file to Word format and will also link to the image files in the folder where IE saved them.
4) Edit/Links . . .
Select all the files and links by clicking once on the top one,
holding the Shift key down, and clicking once on the bottom one
Click the check box, "Save picture in document", and click OK
5) File/Save As . . . - make sure to select Save as Type: Word Document (*.doc)
DONE ! !
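Before opening the saved page in Word (step 3), it can help to confirm that every image the page refers to was actually saved to disk. The following is a minimal Python sketch of such a check, not part of the method itself. The class and function names are hypothetical, and it assumes the saved page refers to its images by paths relative to the folder holding the .htm file.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def find_missing_images(html_text, base_dir):
    """Return the local image references that do not exist under base_dir."""
    collector = ImageSrcCollector()
    collector.feed(html_text)
    missing = []
    for src in collector.sources:
        if urlparse(src).scheme in ("http", "https"):
            continue  # still points at the web, not at a saved copy
        # Assumption: local references are relative to the saved .htm file.
        local = Path(base_dir) / unquote(src).replace("\\", "/")
        if not local.is_file():
            missing.append(src)
    return missing
```

For the example above, you would read the contents of c:\temp\temp1.htm and pass c:\temp as base_dir. An empty list means every locally referenced image is present, so Word should be able to find the pictures when it builds its links.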
Method 2 - Copy and Paste from HTML to Word
*** causes the "Line Feeds Problem" (aka the "Narrow Column" problem) - here we list a fix for this
*** if you want to try this, use any newsgroup posting. Go to www.google.com, click the "Groups" tab, search for anything, open one of the results, and select the posting to copy. Of course, this works for any web page that has text !!
This fix is for converting an article on a web page, NOT for converting any images that may be present in the article. It is for a simple document that you want to convert to a text file or a Word file. In particular, this problem applies to all those helpful tips and methods for solving PC problems that users post on the newsgroups (available by going to www.google.com and clicking "Groups").
If you have ever tried this, then you know the problem. The HTML web page has a paragraph line feed at the end of every line. The lines are short, so you end up with a lot of pages consisting of a narrow column. These line feeds are automatically inserted by newsreaders such as Free Agent when a user sends a post to a newsgroup. Here is an example - a portion of an actual post that I wanted to convert to a text file and save to my hard drive for future reference:
Example of the "Line Feeds" Problem (Narrow Column)
When you copy and paste the HTML into Notepad or Word - these annoying line feeds are pasted as well !! Word assumes that they were placed there by the author. It has no idea that newsreaders insert them automatically. Here we will describe how to get rid of them, resulting in the following:
Example of the "Line Feeds Problem" Fixed (full-width column)
You want to keep all the double paragraph marks, because they separate the actual paragraphs, but you want to delete all the single paragraph marks that cause the narrow column. This way Notepad and/or Word will simply wrap the text at full width. To do this, we temporarily replace every instance of two consecutive paragraph marks with a placeholder to save them, then replace all the single paragraph marks, and then restore the double paragraph marks - as follows:
(this looks like a long process - but it can be done in about one minute !!)
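The same three-pass idea can also be sketched as a short Python script, for anyone who would rather not do the replacements by hand. This is only a sketch under assumptions: the function name, the placeholder string, and the choice to replace each single line feed with a space are illustrative and are not part of the procedure described here.

```python
import re

# Hypothetical marker; any string that cannot occur in the text will do.
PLACEHOLDER = "\x00PARA\x00"


def unwrap_hard_wrapped_text(text):
    """Join hard-wrapped lines while keeping blank-line paragraph breaks."""
    # Normalise Windows and old Mac line endings to a plain line feed.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop trailing blanks on each line, then leading/trailing empty lines.
    text = re.sub(r"[ \t]+\n", "\n", text).strip("\n")
    # Pass 1: protect real paragraph breaks (two or more line feeds in a row).
    text = re.sub(r"\n{2,}", PLACEHOLDER, text)
    # Pass 2: replace the remaining single line feeds (assumption: a space).
    text = text.replace("\n", " ")
    # Pass 3: restore the protected paragraph breaks.
    return text.replace(PLACEHOLDER, "\n\n")


if __name__ == "__main__":
    sample = (
        "This is a short\n"
        "line of text that\n"
        "was wrapped.\n"
        "\n"
        "Second paragraph\n"
        "here.\n"
    )
    print(unwrap_hard_wrapped_text(sample))
```

Running it on text copied from a posting should leave one long line per paragraph, which Notepad (with Word Wrap on) or Word can then wrap at full width.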