HtmlEditor :  Phorum 5 The fastest message board... ever.

This is the discussion forum for the HtmlEditor. See also the HtmlEditor home page, where you can download the control, and the Documentation Wiki, a collaborative project for documenting the control.

webbrowser questions
Posted by: FTA (---.ri.ri.cox.net)
Date: Tuesday, 17-Dec-2002, 21:16:53

I am currently writing code that uses the IE controls to navigate to a web page and then the DHTML DOM to populate passwords etc. In effect I am automating the IE browser to accomplish repetitive tasks (like downloading bank activity). However, when I come across a web page with frames and the frames come from different domains, I am denied access to any frame which does not come from the parent document's domain. This is probably due to the security feature which does not allow scripting across domains. I cannot even do a document.frames(x).length (where x is the number of the frame I am trying to access). I get permission denied.

The workaround would be to get the document elements etc without using the DHTML DOM. Can anyone tell me how to do this in VB.net? This program seems to do it in C, but I am fairly new to OOP and very new to VB.net.

I appreciate any help i can get here - please throw this dog a bone.

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Wednesday, 18-Dec-2002, 07:47:36

The HtmlEditor hosts the MSHTML part of Internet Explorer as an ActiveX document. It allows easy access to the HTML source of the current document so you could grab that and parse it using your own routines rather than the DOM if that helps. Much of the same code can be adapted for use with the full webbrowser control too. Incidentally, it's C# rather than C.

You could try using the HtmlEditor's Navigate method to load the document at the URL required. Handle the ReadyStateChanged event and look for a ReadyState value of "complete", indicating that the document has loaded. Then get the source as a string using the GetDocumentSource() method. It will work fine with VB.Net.

Tim

Re: webbrowser questions
Posted by: FTA (---.ri.ri.cox.net)
Date: Thursday, 19-Dec-2002, 01:28:17

Sorry,

I'm a little confused. Do I put the editor and the webbrowser controls on the same form?

How do I get the document that is currently loaded in the webbrowser control?

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Thursday, 19-Dec-2002, 14:09:36

No, I was suggesting not using the Webbrowser control at all.

Tim

Re: webbrowser questions
Posted by: FTA (---.ri.ri.cox.net)
Date: Thursday, 19-Dec-2002, 16:11:06

Thank you for your recent prompt response.

Please understand that i only have vb.net installed and not vs.net

I was able to import the htmleditor.cs code into a project and now i see the classes.

However - I don't see how to use the Navigate method of the editor. I do this for fun, not for a living - so I'm a bit of a hack. The VB.net thing is really confusing to me since, to date, I have been doing all my coding in Excel VBA.

But i was able to complete my first project in vb.net - which uses the webbrowser control to navigate to a bank website and download the activity information from the document table to a csv file. I envy guys(and gals) like you who understand the "big picture" of OOP. But if i hack away at it long enough i'll get it eventually.

A little history here :
Since i can write programs that use the webbrowser control to accomplish these tasks why am i bugging you "power programmers"?

I own several restaurants and we take pride in reconciling our accounts promptly. To reconcile the bank accounts we must wait for the paper statement to arrive. Usually this happens on the tenth day of the month. Since we discover errors - not all the activity on the bank statements matches the activity in our ledger accounts - we must call the individuals at the restaurants and ask them about transactions that transpired as long ago as six to eight weeks prior. Wouldn't the better solution be to download the account activity weekly from the bank and check it against the ledger activity? Then any errors would be fresh in people's minds.

Furthermore - I've already written code that takes a csv file of bank records and matches the records against transactions in the accounting program database. So all I have to do is get the bank data and the rest is cake. I'm talking about saving my bookkeeper (the only word in the English language with three double letters) many hours a week. Since the bookkeeper happens to be my wife - this frees up a lot more time for...well, you get the picture.

So back to the downloading bank records issue.
We use several banks. I have no problems getting the data programmatically from any of them except one - and that's because the page contains frames and some of the frames are loaded from different domains. Essentially the main document structure appears thusly:

<Frameset>
<Frame> source = here.mybank.com </Frame>
<Frameset>
<Frame> source = here.mybank.com</Frame>
<Frame> source = elsewhere.mybank.com</Frame>
</Frameset>
</Frameset>

(actual domain names hidden to protect the innocent civilians)

In this case, when i use the webbrowser control to access the elements in the main document or any of the frames from here.mybank.com - no problem.

But if i try to access any elements in the frame from elsewhere.mybank.com i get a permission denied error. Of course - that frame contains the activity links that i need to access.

I think this is an IE 6 security feature that is designed to prevent malicious websites from scripting across domains and gaining access to my computer. However, the effect is that automating IE on sites that contain frames from different domains is not possible.

To further compound the problem, the links on that frame change with each session - since the bank sends an encrypted url that is session unique. So I cannot simply "hard code" the webbrowser to navigate to that link without first reading the url in the source view. I can view the source and see it with my eyes but 'permission denied' when trying to access that frame.

The workaround to the permission denied error then is simply a matter of reading the HTML content of that frame - looking for a unique sequence of data that will identify the url. It looks to me that you figured out a way to view the source programmatically - and since I'm a stubborn old fart who won't sleep at night till I figure this out - here I am.
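
The "unique sequence" idea can be sketched in VB.Net roughly as follows. This is only a minimal sketch and has not been tried against any bank site: FindLinkContaining, the sample HTML and the "/activity?" marker are all made up for illustration, and it assumes the frame's HTML is already available as a string.

```vb
Imports System
Imports System.Text.RegularExpressions

Module LinkFinderDemo

    ' Returns the first href value that contains marker, or Nothing if none does.
    ' (Hypothetical helper, not part of the HtmlEditor or the webbrowser control.)
    Function FindLinkContaining(ByVal html As String, ByVal marker As String) As String
        ' Matches href="value" or href='value', in any letter case.
        Dim pattern As String = "\bhref\s*=\s*([""'])(.*?)\1"
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Dim m As Match
        For Each m In Regex.Matches(html, pattern, options)
            Dim url As String = m.Groups(2).Value
            If url.IndexOf(marker) >= 0 Then
                Return url
            End If
        Next
        Return Nothing
    End Function

    Sub Main()
        ' Made-up frame content standing in for the real page source.
        Dim html As String = "<a href='/help.html'>Help</a>" & _
            "<a href='/activity?session=abc123'>Activity</a>"
        Console.WriteLine(FindLinkContaining(html, "/activity?"))
    End Sub

End Module
```

Because the session part of the url changes every time, the marker should be a piece of the link that stays the same between sessions.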

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Thursday, 19-Dec-2002, 16:31:37

Here is how you can get the source, although I don't know if it will solve your problem.

First, put an HtmlEditor control, a TextBox and a Button on a VB form. I set the TextBox.Multiline to True, and the Scrollbars to Vertical.

Next, I went into the code view and added a private class variable as follows:

Private bShowSource As Boolean = False

Staying in code view, I select HtmlEditor1 from the left-hand dropdown, and from the right-hand dropdown choose ReadyStateChanged, to create an event handler. This is the code:

    Private Sub HtmlEditor1_ReadyStateChanged(ByVal sender As Object, ByVal e As onlyconnect.ReadyStateChangedEventArgs) Handles HtmlEditor1.ReadyStateChanged
        ' Only grab the source once, after the requested document has finished loading
        If bShowSource AndAlso (e.ReadyState = "complete") Then
            bShowSource = False
            Dim source As String = HtmlEditor1.GetDocumentSource
            TextBox1.Text = source
        End If
    End Sub

Then I went back to Design view and double-clicked the button, to get a Click event handler. Here's the code:

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        ' Flag that the source should be shown when loading completes
        bShowSource = True
        HtmlEditor1.LoadUrl("http://www.microsoft.com/")
    End Sub

If I run this code, and click the button, I get the source for Microsoft's home page in the text box.

I'm sorry, when I said the Navigate method I should have said LoadUrl.

Tim

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Thursday, 19-Dec-2002, 16:38:20

...as a postscript, of course the problem with this is that you won't get the contents of any frames. However, you will find in the HTML source some line like this:

<frame name="somename" src="http://www.someurl" frameborder="NO" scrolling="AUTO" marginwidth="0" marginheight="0">

...if you extract the value of the src attribute, you could then load "http://www.someurl" into the control and get its source.
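
Pulling out the src values could look something like the sketch below. It is a rough illustration rather than anything built into the control: ExtractFrameSources and ResolveUrl are made-up helper names, the example.com page URL and sample HTML are placeholders, and a regular expression like this only copes with reasonably tidy markup where the src value is quoted. ResolveUrl is there because a frame's src may be relative to the page it came from.

```vb
Imports System
Imports System.Collections
Imports System.Text.RegularExpressions

Module FrameSourceDemo

    ' Returns the src attribute value of every <frame> tag in html, in document order.
    ' (Hypothetical helper; <frameset> tags are skipped because of the \b after "frame".)
    Function ExtractFrameSources(ByVal html As String) As ArrayList
        Dim sources As New ArrayList()
        ' Matches <frame ... src="value"> or src='value', in any letter case.
        Dim pattern As String = "<frame\b[^>]*?\bsrc\s*=\s*([""'])(.*?)\1"
        Dim options As RegexOptions = RegexOptions.IgnoreCase Or RegexOptions.Singleline
        Dim m As Match
        For Each m In Regex.Matches(html, pattern, options)
            sources.Add(m.Groups(2).Value)
        Next
        Return sources
    End Function

    ' Turns a possibly relative src value into an absolute URL,
    ' using the URL of the page that contained the frame as the base.
    Function ResolveUrl(ByVal pageUrl As String, ByVal src As String) As String
        Return New Uri(New Uri(pageUrl), src).AbsoluteUri
    End Function

    Sub Main()
        ' Placeholder page URL and markup, for illustration only.
        Dim pageUrl As String = "http://www.example.com/bank/main.html"
        Dim html As String = "<frameset>" & _
            "<frame name=""somename"" src=""http://www.someurl"" frameborder=""NO"">" & _
            "<frame name='side' src='/files/side.html'>" & _
            "</frameset>"
        Dim src As String
        For Each src In ExtractFrameSources(html)
            Console.WriteLine(ResolveUrl(pageUrl, src))
        Next
    End Sub

End Module
```

Each resolved URL could then be passed to LoadUrl in turn, with the same ReadyStateChanged handler picking up the source.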

Tim

Re: webbrowser questions
Posted by: FTA (---.ri.ri.cox.net)
Date: Thursday, 19-Dec-2002, 20:19:25

Thank you Tim for your response.

If only it were that easy!!! But I am up against a formidable opponent.

Populating the password and logging in is no problem. Once the submitloginform event is fired the aforementioned page loads. Here, once more, is the basic outline.

<Frameset>
<Frame> source = here.mybank.com </Frame>
<Frameset>
<Frame> source = here.mybank.com</Frame>
<Frame> source = elsewhere.mybank.com</Frame>
</Frameset>
</Frameset>


However, when I navigate to elsewhere.mybank.com I am redirected to the main login page!!! The frame does not load.

Here is the actual document source:

<html><head><title>Fleet | Small Business Services: Online Banking</title></head>
<frameset ROWS='120,*' BORDER='0'>
<frame src='https://smallbiz.fleet.com/cgi-bin/isbsprd.dll/scripts/ jump/onlBankingTopNav.jsp?BV_SessionID=aaaaaaaaaaaaaaaaa& V_EngineID=aaaaaaaaaaaa.0' NAME='head' FRAMEBORDER='0' SCROLLING='No' MARGINWIDTH='0' MARGINHEIGHT='0'>
<frameset COLS='11,749,*' BORDER='0'>
<frame src='/files/sbs/onlBankingLeftFrame.html' NAME='side' FRAMEBORDER='0' SCROLLING='no' MARGINWIDTH='0' MARGINHEIGHT='0'>
<frame src='https://officelink-pb.fleet.com/scripts/WebObjects.dll/OfficeLink.woa/wa/reconnect?reconnect=4' NAME='body' FRAMEBORDER='0' SCROLLING='Auto' MARGINWIDTH='0' MARGINHEIGHT='0'>
</frameset>
</frameset>
</html>

The last frame is the problem, and when I navigate to
[officelink-pb.fleet.com]
I get reconnected to the login page.

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Thursday, 19-Dec-2002, 21:07:29

It won't be easy. The framed page is generated by a server-side process, and can't be retrieved just from the url. You would have to give it the security information it expects as part of the request. Maybe there is some other way of getting at the frame source, I don't know.

BTW I don't advise posting the full source for something like an online banking page. I imagine the session IDs expire quickly, and that other checks are made on the request, but I've edited your post to scramble them a little anyway.

Tim

Re: webbrowser questions
Posted by: FTA (---.ri.ri.cox.net)
Date: Thursday, 19-Dec-2002, 22:39:33

Thank you Tim and also thanks for looking out for me.

The info I posted came from an expired session but I appreciate you looking out for me anyway.

I have been very frustrated by this problem.

This much I know - the urls I need are visible in the frame.
I can see the urls if I do a "view source" right click in that frame.
I can't get to the frame using the DHTML DOM (permission denied).

So I think I have three choices:

1) Find some way to get the "stream" of html elements from IE while it is parsing. Isn't it possible to get the stream from some other class? Maybe the system.io or system.web class. The stuff is sitting in my memory - there must be a way to read it after it is decrypted. Anyone??? Any idea where to start? Any idea where to ask???

2) Use another browser that makes its DOM available to me.
If I have to I will, but I'm already familiar with the webbrowser control.

3) Program my own browser. I have a better chance of winning the lottery than accomplishing this task. It seems one would have to be an expert coder with a real fundamental knowledge of the operating system. I am not, and I do not have the expertise.

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Thursday, 19-Dec-2002, 22:57:48

OK, I've done a bit more research.

Each Frame object has its own IWebBrowser2 interface. This also implements IPersistStreamInit, which is the COM interface we use to get the document source. So if you can get a reference to the Frame object you should be able to do this. A complication is that according to MSDN there is a bug in some versions of IE which prevents this from working.

Unfortunately I don't have time to try this at the moment. There's some Delphi code here:

[www.experts-exchange.com]

which addresses exactly this problem. It should be possible to do this in C# or VB.Net as well.

It would be nice to build this functionality into the control if anyone has time to experiment...

Tim

Re: webbrowser questions
Posted by: javaga (61.141.196.---)
Date: Friday, 20-Dec-2002, 13:57:31

How do I create an empty web project?
I created an empty web project and set up the web site in IIS, but when I run the project only the first page can be opened and none of the controls can be used. Why?
Thanks

Re: webbrowser questions
Posted by: Tim Anderson (---.server.ntl.com)
Date: Sunday, 22-Dec-2002, 16:52:21

Hi Javaga,

This looks more like a general web project question. I'd try asking in the dotnet newsgroups.

Tim


