Help understanding ADO codes and similar to get data quick. ACE JET PeculiaritiesAka. WTF is the ACE and other Queer Stuff taking so long while a Jet went like “Billy Ho”( This Thread follows on a bit from this thread https://www.excelforum.com/excel-pro...ml#post4630973 )
( I had to change the title above a bit after my last after thought gave some really unexpected results *** ??? )
I am looking for some general comments and advice here. I will try and keep it brief, but I have cracked my head on this one a lot over the last couple of weeks...
Very briefly what I am trying to do:
I have a big main file already with lots of data. I may add some more. But I want to keep it manageable. So I am trying to find a good compromise with having in addition to the Main File, some data in other data files which I can add to my main file fairly quickly. That way I can
_ add to,
_ replace, or
_ just use temporarily in the main file , then delete.
That is the way I want to go about things,... for now. So I would want that to work fairly quick , in “real time as it were”. That is why I have dome some detailed experiments comparing the times of different codes to bring data in.
So I have looked quite thoroughly at getting data from the other data files, in particular whilst keeping those data files closed.
After taking advice, ( and listening to my own ), I went very mainly along the ADO way of doing things.
But some of the results for time taken were disappointing so I started again...
I have done more codes and time tests than I can remember, so I will just summarise briefly here and reference some “ appendix” posts elsewhere where I look at each code in much more detail..
A bit more detailed background info to what I am doing is given here:
https://www.excelforum.com/developme...ml#post4619363
To simplify for the purposes of this Thread, let’s just say I want to get data from a file that is in a long column in a worksheet and put it in a VBA variant Member element type Array , arrBk().
Codes I have tried so far:
some initial simple codes
I suspect that finally I will get the data from a closed workbook, but just as a start point and for a frame of reference I done some simple ones for comparison
_ Simple capture of the data into the Array if the File was already open ( so this sets my best “hoped for” standard as it were )
https://www.excelforum.com/developme...ml#post4619759
_ Timing opening the data file in the existing Excel , filling the array, then closing.
https://www.excelforum.com/developme...ml#post4619960
_ Again timing opening, capturing the data and closing, but using a second instance object of Excel
https://www.excelforum.com/developme...ml#post4619980
Ms Queer Stuff
_ I did a lot of codes which ended up being great learning but only final novelty value based on a maco driven Ms Query. A lot of detail in this Thread ( https://www.excelforum.com/excel-pro...-xl-files.html ). Some example codes starting from about here:
https://www.excelforum.com/developme...ml#post4620321
And going on to around here:
https://www.excelforum.com/developme...ml#post4626489
Ending up finally with this code to use in my speed test
https://www.excelforum.com/developme...ml#post4626501
_ then I went on to the three typical ADO type codes.
A tried quite a few variations, here a few:
_ Looping through records ( rows)
https://www.excelforum.com/developme...ml#post4626697
_ Using the VBA Range object CopyFromRecordset method
https://www.excelforum.com/developme...ml#post4626707
_ Using the ADO GetRows() Method
https://www.excelforum.com/developme...ml#post4626709
At his point a few things were clear.....
It’s the opening of the ADO connection that takes all the time. I do not quite understand that. ( The opening of the record set is quite quick, as is any simple processing after. )
I cannot see any noticeable difference in the 3 different ADO ways I did . ( That goes for the codes I have done for myself, or in answering Threads on this- I often practiced by trying a few ADO ways, - I never noticed any difference in times ). They all use the same opening of the ADO connection, so that dominates )
I am puzzled what where makes the data “table as it were”.
I am thrown off as always by Object Orientated Programming terminology. If the ADO connection is an object, then is it a Highway, like ADOConnectHighway , and it takes a while as it is the best Interplanetary Highway imaginable that I could transport all my computer data along in one go. ??? Which I don’t need
After this I wondered off into looking at
Codes based on using the unique fixed vectors for a single cell over an extended range, ( in a closed workbook reference ).
( In plain text simple links to closed workbook).
Starting from about here
https://www.excelforum.com/developme...ml#post4626713
and going on to about here
https://www.excelforum.com/developme...ml#post4628606
It surprised me that the simple closed workbook links worked better than all the codes so far.. and I wondered why that work at all. Is it an “internal Excel thing”, fundamental to how excel works on spreadsheet values held in a simple help matrix somewhere, which coincidently needs little modification to get at the matrix of a different Workbook. That I can understand I think. I do not believe we really have an “Open” Workbook ever. We work on a pseudo copy of the workbook and reference the data from the cache held of the data.
( That in turn has a ( dangerous or potentially very useful ) advantage that when I pass my main Workbook on to someone, but do not give them the closed data workbook , then links and formulas including multi cell range references to the closed workbook, will still work , as can access ( potentially all ) the data from that no longer available data workbook. This is because that data is still to be found in the Cache held in the main Workbook of the last referenced data
( http://www.excelforum.com/excel-prog...ml#post4319226
https://www.excelforum.com/developme...ml#post4482591
https://www.excelforum.com/developme...ml#post4492413 )
_....
Anyway , after all that last lot of codes, I thought I had something really clever with my “Codes based on using the unique fixed vectors for a single cell over an extended range, ( in a closed workbook reference )”
I was just going to write another book on it, Lol... when , just for fun I tried a go at a ADO JET instead of the nowadays more common ADO ACE... I couldn’t get a code to work on my data file unless I resaved it as .xls, ( Excel 97 – 2003 stylio ) –
( Working jet Code ( needed .xls File ) https://www.excelforum.com/developme...ml#post4636087 )
I note here that any ADO code does not return me more than 255 columns anyway.. so I have not lost too much..( My Files are typically a couple of thousand columns, so over pre XL 2007 255 column limit, but rows would never, I think exceed pre XL 2007 65536 row limit )
ADO JET provider alternative
Taking a look at a quick set of speed measurements here _...
https://www.excelforum.com/developme...ml#post4636533
https://www.excelforum.com/developme...ml#post4636716
_... the JET code nocks the socks off all the others. ???
So .. WTF is going on .. ?
Is the answer that the .Open on a “Connection” to a .xlsx or .xlsm file somehow makes the whole massive worksheets available resulting in that very long time compared to the JET code which necessarily needs a .xls file to work, and therefore has to “open up” less or “lay open” less?.
( I assume I cannot use JET on a .xlsx or .xlsm file - I have not found a connection string that works yet. )
Can anyone shed any light on this, - like for example does no one use ADO on any Excel File ?? – I know I have on answering Threads, but in a more demanding situation it seems totally useless compared with even a simple closed reference.
_.____________
To summarise the help I am asking for in this Thread..
_1) Any general advice on the best way to go about what I am doing.
_ 2) Any simple to understand explanation of what is going on with the codes I have done, in particular
_ 2a) ADO ACE JET stuff: what is actually happening internally with all this .Opening of a connections and the .Opening of the record set , - what is the subtle difference. ( If can get a better understanding then I might see the best way to do what I am doing )
_ 2b) My “Codes based on using the unique fixed vectors for a single cell over an extended range, ( in a closed workbook reference )” – As I understand it somewhere the last ( see the references I gave above ) lot of data pulled in from a closed range reference is held in some XML thingy file, “held in the Cache” .. – if I could tap on to that it would be handy .... – according to those references I gave, it will do that automatically sometime use that Cached info. anyway . But if I could purposely access that it could be useful ... This seems to be an area that no one has investigated...
_3 ) The last JET versus ACE issue ???? Can I not either get JET to work on my .xlsx and ..xlsm files or somehow control ACE that is does not take so long, ( I am guessing it is something to do with the Connection “object” thing, as that is what seems to be causing all the grief )
Thanks
Alan
( No rush on this, - I have been looking at this for a while, and probably still will be for a little while .. .. . )
Bookmarks