+ Reply to Thread
Results 1 to 21 of 21

Efficient VBA code for conditionally deleting rows in big excel sheet

  1. #1
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Efficient VBA code for conditionally deleting rows in big excel sheet

    My excel sheet has 40 columns and more than 1,00,000 rows. I want to delete all the rows which contain cell with a string "NA" in any of the columns. I am struggling to find an efficient VB code for this, which doesn't cause excel to crash. My current VBA code (explained below) takes forever to run (>5 mins on Intel Xenon and 16 GB RAM) and crashes on slower machines (i5, 4 GB RAM). Any suggestions to streamline and make it faster?

    P.S. Exact number of rows and columns are not known apriori. And, I'm new to VBA, any help would be greatly appreciated.

    Link to the concerned file - https://drive.google.com/file/d/0Bzl...ew?usp=sharing

    My VBA code:

    Please Login or Register  to view this content.
    The same question is also available at : VBAexpress & Mrexcel
    Last edited by akash27; 07-07-2017 at 03:20 PM. Reason: Added Code Tags

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,258

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Hello akash27,

    Welcome to the forum!

    So we can answer your questions quickly and correctly, please take some time to read the forum rules.

    You obviously are need of an answer as soon as possible because you also posted this same question at VBA Express. Their rules are similar to ours. Please let us know when you posted the same question in another forum when little time has past between the postings.
    Sincerely,
    Leith Ross

    Remember To Do the Following....

    1. Use code tags. Place [CODE] before the first line of code and [/CODE] after the last line of code.
    2. Thank those who have helped you by clicking the Star below the post.
    3. Please mark your post [SOLVED] if it has been answered satisfactorily.


    Old Scottish Proverb...
    Luathaid gu deanamh maille! (Rushing causes delays!)

  3. #3
    Forum Guru xladept's Avatar
    Join Date
    04-14-2012
    Location
    Pasadena, California
    MS-Off Ver
    Excel 2003,2010
    Posts
    12,378

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Try this:

    Please Login or Register  to view this content.
    Last edited by xladept; 07-06-2017 at 04:53 PM. Reason: Quicker Version
    If I've helped you, please consider adding to my reputation - just click on the liitle star at the left.

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~(Pride has no aftertaste.)

    You can't do one thing. XLAdept

    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~aka Orrin

  4. #4
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    give this a go

    Please Login or Register  to view this content.
    takes about 8 seconds to complete on my system
    Last edited by thatandyward; 07-06-2017 at 10:36 PM.

  5. #5
    Forum Expert
    Join Date
    12-10-2006
    Location
    Sydney
    MS-Off Ver
    Office 365
    Posts
    3,527

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    A data matrix that is more than 1,000,000 rows long and 40 columns wide is pushing Excel to it's limits. That said try this non-looping method while on the tab in question (initially on a copy of your data as the results cannot be undone if they're not as expected):

    Please Login or Register  to view this content.
    Regards,

    Robert
    ____________________________________________
    Please ensure you mark your thread as Solved once it is. Click here to see how
    If this post helps, please don't forget to say thanks by clicking the star icon in the bottom left-hand corner of my post

  6. #6
    Forum Guru
    Join Date
    08-15-2004
    Location
    Tokyo, Japan
    MS-Off Ver
    2013 O.365
    Posts
    22,652

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Please Login or Register  to view this content.

  7. #7
    Forum Expert
    Join Date
    02-14-2009
    Location
    .
    MS-Off Ver
    ................
    Posts
    2,840

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Your post does not comply with Rule 8 of our Forum RULES. Do not crosspost your question on multiple forums without including links here to the other threads on other forums.

    Cross-posting is when you post the same question in other forums on the web. The last thing you want to do is waste people's time working on an issue you have already resolved elsewhere. We prefer that you not cross-post at all, but if you do (and it's unlikely to go unnoticed), you MUST provide a link (copy the url from the address bar in your browser) to the cross-post.

    Expect cross-posted questions without a link to be closed and a message will be posted by the moderator explaining why. We are here to help so help us to help you!

    Read this to understand why we ask you to do this, and then please edit your first post to include links to any and all cross-posts in any other forums (not just this site).

  8. #8
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Jindon - I like this approach, much better/faster to process as a text file; however I think your RegEx pattern is incorrect, this leaves lines where NA is only in the last column i.e. there is no trailing comma. the pattern "^.*NA,.*?\r\n|^.*,NA\r\n" would remove all occurrences of NA including those only in the last column.

    this approach also got me thinking about how else to achieve the goal, as this is a CSV file, parsing as text at the command line is likely the most efficient way; Jindon's code effectively does this, just from within Excel.

    if you're on a Unix machine (Linux or OS X) then you can use SED (stream editor) from the terminal. this processes the changes in under a second!

    sed '/NA/d' ahs-comb-madhya_pradesh-dhar.csv > ahs-comb-madhya_pradesh-dhar_CLEAN.csv

    I believe you could use PowerShell on Windows to do something similar but am not very familiar with it, the below command appears to work, although takes about 7 seconds, so not as fast as Jindon's code.

    Get-Content .\ahs-comb-madhya_pradesh-dhar.csv | Where-Object {$_ -CNotMatch 'NA'} | Set-Content ahs-comb-madhya_pradesh-dhar_CLEAN.csv
    Last edited by thatandyward; 07-07-2017 at 12:08 PM.

  9. #9
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Shows run-time error '13': Type mismatch
    & did you get 8 sec completion time on the 40 MB excel file which I have shared on google drive?

    Quote Originally Posted by thatandyward View Post
    give this a go

    Please Login or Register  to view this content.
    takes about 8 seconds to complete on my system
    Last edited by akash27; 07-07-2017 at 03:33 PM.

  10. #10
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Quote Originally Posted by akash27 View Post
    Shows run-time error '13': Type mismatch
    & did you get 8 sec completion time on the 40 MB excel file which I have shared on google drive?
    works for me on the 40MB CSV file from the google drive link. are you copying this code into a new module and then running?

    yes, it takes 8 secs to complete for the 40MB file

  11. #11
    Forum Expert
    Join Date
    02-14-2009
    Location
    .
    MS-Off Ver
    ................
    Posts
    2,840

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Value for money - there's at least 20 people replying to this on 4 different forums. So much for forum "rules" about adding links.

  12. #12
    Forum Guru xladept's Avatar
    Join Date
    04-14-2012
    Location
    Pasadena, California
    MS-Off Ver
    Excel 2003,2010
    Posts
    12,378

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Did you try my routine? What happened??

  13. #13
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    No, I'm running the code in Sheet 1 only.
    Quote Originally Posted by thatandyward View Post
    works for me on the 40MB CSV file from the google drive link. are you copying this code into a new module and then running?

    yes, it takes 8 secs to complete for the 40MB file

  14. #14
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Thanks! this is the fastest method (perhaps because Windows allow PowerShell to use all cores, whereas multithreading is not available in VB)
    Quote Originally Posted by thatandyward View Post
    Jindon - I like this approach, much better/faster to process as a text file; however I think your RegEx pattern is incorrect, this leaves lines where NA is only in the last column i.e. there is no trailing comma. the pattern "^.*NA,.*?\r\n|^.*,NA\r\n" would remove all occurrences of NA including those only in the last column.

    this approach also got me thinking about how else to achieve the goal, as this is a CSV file, parsing as text at the command line is likely the most efficient way; Jindon's code effectively does this, just from within Excel.

    if you're on a Unix machine (Linux or OS X) then you can use SED (stream editor) from the terminal. this processes the changes in under a second!

    sed '/NA/d' ahs-comb-madhya_pradesh-dhar.csv > ahs-comb-madhya_pradesh-dhar_CLEAN.csv

    I believe you could use PowerShell on Windows to do something similar but am not very familiar with it, the below command appears to work, although takes about 7 seconds, so not as fast as Jindon's code.

    Get-Content .\ahs-comb-madhya_pradesh-dhar.csv | Where-Object {$_ -CNotMatch 'NA'} | Set-Content ahs-comb-madhya_pradesh-dhar_CLEAN.csv

  15. #15
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Quote Originally Posted by akash27 View Post
    No, I'm running the code in Sheet 1 only.
    it should still work in the Sheet1 module; have you tried running it with a 'fresh' file?

    I have just re-verified the code works with the original download. these are the steps I took.

    1. download CSV from the link you provided
    2. open CSV in excel
    3. go to developer tab & click Visual Basic button
    4. open up the Sheet1 code module
    5. paste my code into sheet1 code module
    6. close Visual Basic editor
    7. click Macro button
    8. select Sheet1.DeleteNAs and click run

    the code then successfully runs without any errors.
    Last edited by thatandyward; 07-07-2017 at 05:20 PM.

  16. #16
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    causes Excel to crash (even on Xenon machines)
    Quote Originally Posted by xladept View Post
    Did you try my routine? What happened??

  17. #17
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Quote Originally Posted by akash27 View Post
    Thanks! this is the fastest method (perhaps because Windows allow PowerShell to use all cores, whereas multithreading is not available in VB)
    glad it worked.

    performing operations in memory is always faster than inside an instance of excel as there is less overhead to deal with. my earlier code achieves similar performance as it loads the data into an array and processes it in memory then writes it out to the sheet all at once.

    Jindon's code also manipulates the data in memory, his code is even faster (ran at about 5 sec on my machine)

    using SED on a Unix machine is particularly impressive as it can process the changes in under a second!

  18. #18
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Same error Type mismatch
    Quote Originally Posted by thatandyward View Post
    it should still work in the Sheet1 module; have you tried running it with a 'fresh' file?

    I have just re-verified the code works with the original download. these are the steps I took.

    1. download CSV from the link you provided
    2. open CSV in excel
    3. go to developer tab & click Visual Basic button
    4. open up the Sheet1 code module
    5. paste my code into sheet1 code module
    6. close Visual Basic editor
    7. click Macro button
    8. select Sheet1.DeleteNAs and click run

    the code then successfully runs without any errors.

  19. #19
    Forum Guru
    Join Date
    08-15-2004
    Location
    Tokyo, Japan
    MS-Off Ver
    2013 O.365
    Posts
    22,652

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    Quote Originally Posted by thatandyward View Post
    Jindon - I like this approach, much better/faster to process as a text file; however I think your RegEx pattern is incorrect, this leaves lines where NA is only in the last column i.e. there is no trailing comma. the pattern "^.*NA,.*?\r\n|^.*,NA\r\n" would remove all occurrences of NA including those only in the last column.
    Should work as it is.

    akash27,

    Mark the thread as "SOLVED" and NEVER do a cross posting without links.
    Last edited by jindon; 07-07-2017 at 10:04 PM.

  20. #20
    Registered User
    Join Date
    07-06-2017
    Location
    India
    MS-Off Ver
    2016
    Posts
    7

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    In the same CSV file if I want to delete all the entries for age<18. The fastest VB code would first sort the "age" column, delete all the rows till it hits 18 and then shift the cells upward.
    @thatandyward In your opinion, could this compare and delete operation be done in a faster way through PowerShell?

    Quote Originally Posted by thatandyward View Post
    Jindon - I like this approach, much better/faster to process as a text file; however I think your RegEx pattern is incorrect, this leaves lines where NA is only in the last column i.e. there is no trailing comma. the pattern "^.*NA,.*?\r\n|^.*,NA\r\n" would remove all occurrences of NA including those only in the last column.

    this approach also got me thinking about how else to achieve the goal, as this is a CSV file, parsing as text at the command line is likely the most efficient way; Jindon's code effectively does this, just from within Excel.

    if you're on a Unix machine (Linux or OS X) then you can use SED (stream editor) from the terminal. this processes the changes in under a second!

    sed '/NA/d' ahs-comb-madhya_pradesh-dhar.csv > ahs-comb-madhya_pradesh-dhar_CLEAN.csv

    I believe you could use PowerShell on Windows to do something similar but am not very familiar with it, the below command appears to work, although takes about 7 seconds, so not as fast as Jindon's code.

    Get-Content .\ahs-comb-madhya_pradesh-dhar.csv | Where-Object {$_ -CNotMatch 'NA'} | Set-Content ahs-comb-madhya_pradesh-dhar_CLEAN.csv

  21. #21
    Registered User
    Join Date
    08-08-2014
    Location
    Lancaster, PA
    MS-Off Ver
    2016 (windows & mac)
    Posts
    94

    Re: Efficient VBA code for conditionally deleting rows in big excel sheet

    it's always going to be faster to perform the operations in memory rather than using excel functions.

    it is possible to do this with Powershell, similar to how we removed all the lines containing 'NA', we just need a slightly more advanced RegEx statement.

    ^(?:[^,]*\,){4}([0-9],|1[0-7]) will match all lines where the digits after the 4th comma are less than 18, so we can put that into the same Where-Object -CNotMatch statement

    Get-Content .\ahs-comb-madhya_pradesh-dhar_CLEAN.csv | Where-Object {$_ -CNotMatch '^(?:[^,]*\,){4}([0-9],|1[0-7])'} | Set-Content ahs-comb-madhya_pradesh-dhar_CLEAN_OVR18.csv

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. [SOLVED] Deleting rows conditionally
    By pewe in forum Excel General
    Replies: 3
    Last Post: 01-13-2017, 01:54 PM
  2. [SOLVED] How can I conditionally add up time in a more efficient manner
    By sherlock99 in forum Excel Formulas & Functions
    Replies: 4
    Last Post: 05-06-2016, 09:47 AM
  3. Changing code from deleting rows to cut/paste rows into another sheet and delete blank row
    By kmarshall6576 in forum Excel Programming / VBA / Macros
    Replies: 4
    Last Post: 04-18-2013, 01:54 AM
  4. Conditionally Copy rows to new sheet in Excel 2007
    By souvick in forum Excel Programming / VBA / Macros
    Replies: 15
    Last Post: 11-16-2011, 09:23 AM
  5. VBA - Conditionally deleting rows
    By Phosphonothioic in forum Excel Programming / VBA / Macros
    Replies: 0
    Last Post: 10-19-2011, 06:38 PM
  6. Conditionally deleting rows
    By alta14 in forum Excel Programming / VBA / Macros
    Replies: 4
    Last Post: 10-20-2009, 04:28 PM
  7. Help with Excel VBA code - Deleting Rows
    By gavinbf in forum Excel Programming / VBA / Macros
    Replies: 1
    Last Post: 01-30-2005, 07:02 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1