Hi,
I need help to write a macro that creates a new 2D array with rows that have same values in 2 2D arrays.
For example:
Array1
1 2
3 4
5 6
7 8
9 10
Array2
1 2
2 3
4 5
5 6
7 8
Array3 (new 2D array to be created)
1 2
5 6
7 8
Thanks.
Hi,
I need help to write a macro that creates a new 2D array with rows that have same values in 2 2D arrays.
For example:
Array1
1 2
3 4
5 6
7 8
9 10
Array2
1 2
2 3
4 5
5 6
7 8
Array3 (new 2D array to be created)
1 2
5 6
7 8
Thanks.
e.g
Please Login or Register to view this content.
Jindon,
Sorry. Let me explain better.
I have 2 2D arrays (Array1 and Array2) and I need to create a new array (Array3) based on the analysis of these 2 arrays (rows with same values).
The code I need should be like:
Thanks.Please Login or Register to view this content.
Last edited by JOAO12; 04-24-2021 at 06:24 AM.
Hi,
could be easier with cells rather than arrays just with Excel basics …
With arrays you need to test each 'row' via a loop for example.
Hi, Marc L,
The actual arrays (Array1 and Array2) have 1 million rows each.
Other question: even if I use 2 ranges, I need to loop one range by the other, right?
All I need is a fast macro to analyze 2 ranges (or 2 arrays) with 1 million rows each and create a new range (or new array) with rows that have same values. The result (Array3 or a third range) should be written in Range("I1").
Array1 = Range("A1:B1000000").Value
Array2 = Range("E1:F1000000").Value
Write Array3 (or the result from range comparison) in Range("I1")
Thanks.
Last edited by JOAO12; 04-24-2021 at 07:59 AM.
For sorted ranges an optimization should be to work from columns concatenations …
Anyway for sorted arrays only, a VBA beginner starter :
PHP Code:
Sub Demo1()
V = [{1,2;3,4;5,6;7,8;9,10}]
W = [{1,2;2,3;4,5;5,6;7,8}]
ReDim X(1 To UBound(V))
L& = 1
For R& = 1 To UBound(V)
Do
If W(L, 1) = V(R, 1) Then
If W(L, 2) = V(R, 2) Then
N& = N& + 1
X(N) = Array(V(R, 1), V(R, 2))
If L < UBound(W) Then L = L + 1: Exit Do Else Exit For
End If
ElseIf W(L, 1) > V(R, 1) Or L = UBound(W) Then
Exit Do
End If
L = L + 1
Loop
Next
If N Then
ReDim Preserve X(1 To N)
X = Application.Index(X, Evaluate("ROW(1:" & N & ")"), [{1,2}])
Else
X = Split("")
End If
Stop ' see what X contains
End Sub
► Do you like it ? ► ► So thanks to click on bottom left star icon « ★ Add Reputation » ! ◄ ◄
Please Login or Register to view this content.
Jindon,
to avoid errors from one million rows data I would add a delimiter within the concatenation !
As comparing as it is for example {1,11} and {11,1} returns the same concatenation - 111 - but for different pairs …
Another point : as worksheet function Transpose is limited to 65K rows so it could be an issue if more than 65K pairs are matching !
Last edited by Marc L; 04-24-2021 at 12:15 PM.
Here is the code tested for 10 rows. The code does not restrict the number of rows and number of matching rows.
Change !0 to the required number. First trail on a copied file. Array need not be sorted one.
Please Login or Register to view this content.
Pl note
Array formula should be confirmed with Ctrl+Shift+Enter keys together.
If answere is satisfactory press * to add reputation.
Hi, Jindon, Marc L and kvsrinivasamurthy,
I would like to thank you all for your prompt reply.
Please, let me test your codes with the actual file.
Thanks.
Last edited by JOAO12; 04-24-2021 at 02:47 PM.
Hi JOAO12,
I was looking into this but had a question about duplicity.
Array1
1 2
3 4
5 6
7 8
9 10
A A
B B
B B
Array2
1 2
2 3
4 5
5 6
7 8
A A
B B
C C
Array3 (new 2D array to be created)
1 2
5 6
7 8
Would you get:
A A
B B
OR (Because it exist in both columns)
A A
B B
B B
Edited:
Please see attached on a set for 10,000 random numbers. Takes about a minute to run 10,000 data.
NOTE: This code does not accommodate for duplicity.
Please Login or Register to view this content.
Last edited by D13L; 04-24-2021 at 05:30 PM.
Hi D13L,
Both Array1 and Array2 have 5 rows and 2 columns and are filled with numbers.
I need to create a new array (Array3) with rows that exist in both arrays (Array1 and Array2), that is:
{1, 2} = First row in Array1 = First row in Array2
{5, 6} = Third row in Array1 = Forth row in Array2
{7, 8} = Forth row in Array1 = Fith row in Array2
Thanks for the code.
Last edited by JOAO12; 04-24-2021 at 05:54 PM.
I apologize, I must not be explaining myself well.
My code checks Arr1 and loops through Arr2 until it finds a match:
I've added a scenario to your example:
Array1
1 2
1 2
1 2
3 4
5 6
7 8
9 10
Array2
1 2
1 2
0 0
2 3
4 5
5 6
7 8
Array3 (new 2D array to be created)
1 2 <=== Does this get 2 or 3? (My code would produce 3, does not account for duplicity which array1 is top level for match)
5 6
7 8
Edit:
Hope this clarified it a bit better. I've added an temp array that would account for duplicates but have not implemented it in. Wasn't sure if duplicity was a factor.
Last edited by D13L; 04-24-2021 at 06:38 PM.
Anyway,
With the post#13 above 1st code:
12
12
12
56
78
Code appended for duplicity:
12
12
56
78
Please Login or Register to view this content.
D13L,
Now I understand the scenario you mentioned, but in Array3 (arr3) row {1, 2} needs to appear just once (no need to appear twice or three times).
For now, each array has no duplicate rows. So, for now I am not concerned with it. The goal is that the code runs as fast as possible.
Thanks.
Last edited by JOAO12; 04-24-2021 at 07:18 PM.
EDITED: ERROR IN CODE 11:02PM
EDITED: CORRECTED CODE AND UPDATE ATTACHMENT 11:12PM
CODE CORRECTED. UPADATED ATTACHMENT
Please Login or Register to view this content.
Last edited by D13L; 04-24-2021 at 11:12 PM.
Please Login or Register to view this content.
Sorry for post#16,
found error in the result/code
EDITED CORRECTED POST#16
Last edited by D13L; 04-24-2021 at 11:14 PM.
This code tested for 10 rows. This is faster.
Please Login or Register to view this content.
Last edited by kvsrinivasamurthy; 04-25-2021 at 05:14 AM.
The drawback with Dictionary object is it is fast until it reaches a number of elements then it becomes slower than a VBA Collection !
This critical threshold is between 400K & 500K elements.
We could proceed to some tests at least with an attachment saved as binary (.xlsb) …
Last edited by Marc L; 04-25-2021 at 06:25 AM.
Here we have another poster who concerns with the code posted from other poster...
No need to comment, just post your code.
Hi,
I edited post#16 which should be the corrected code. This is utilizing the "remove Duplicate" in excel.
Logic
1) Concatenate array1 then remove duplicate within array1
2) Concatenate array2 then remove duplicate within array2
3) Stack Array1 and Array Then remove duplicate with stacked array1 and array2
4) The ending result would be array3
Tried this with 1 mil random decimal numbers. Finished it in a couple of seconds.
Would you mind checking it out.
Please Login or Register to view this content.
Jindon, yet done according to the poor initial post … For next step as I wrote with an attachment …
Hi Marc L,
I'm assuming this comments is for me (D13L).
I'm self taught and really chose logic based on my limited knowledge. I really would appreciate some positive criticism. I wasn't sure if my method has any flaws since I learn by jumping in the lion's den .
I'd really appreciate if you can define the flaw. (I liked your perception above post really insightful and taught me well, like how my first attempt used the array loop which was really slow)
Thanks
Not for you, just well read …
Ohhh no. Sorry everyone. The code is no good. I forgot the most important step. COMPARE THE 2 ARRAYS.
PLEASE IGNORE POST#16 AND #22
Thank you.
...Blush
Looks like I can actually utilize the program still by...
LOGIC:
1) Concatenate array1 then remove duplicate within array1
2) Concatenate array2 then remove duplicate within array2
3) Stack Array1 and Array2 Then remove duplicate with stacked array1 and array2
4) #3 becomes arrayTemp
5) Sort arrayTemp A-Z
6) Sort the smallest of the two array1/array2
7) Compare
8) And the result should be array3
The only issue I find with this is; Does the original poster allows sorting?
DL13, as within the initial post the arrays are sorted but its 'explanation' seems not closed to its real need and not what any forum expects for …
So if you do not want to waste any more time, just wait at least for a complete & crystal clear explanation
- what is the destination of the result array ? If back to a new range, the array can be avoided … -
and as a bonus with an xlsb workbook containing at least 70K matching pairs …
MarcL,
Thanks for the suggestion. I'm REALLY ENJOYING this so I kept on going.
Also my logic above is completely flawed from #3-#8
New logic
1) Concatenate array1 then remove duplicate within array1
2) Concatenate array2 then remove duplicate within array2
3) removed array1 special chara(~ and .) and set to a cell range Creates a long number
4) removed array2 special chara(~ and .) and set to a cell range Creates a long number
5) Compare(MATCH) short array to the large array ==> gives me row number. THIS PORTION I AM NOT CONFIDENT ON ESPECIALLY IF THE Original Poster HAS A DIFFERENT SPECIAL CHAR. CODE DEPENDENT THAT IT IS A LONG NUMBER NOT A STRING
6) Sort it to A-Z ==> isolate blanks
7) Used Index to match to the original concatenate array
8) Split special character to the remainder array. THIS BECOMES ARRAY3
I tested it on 1 million random numbers(1-1000). Program ran around 200 seconds.
Hopefully this is good. This is my final attempt, I can't logically break it down any further.
Anybody interested in reviewing, I'd be glad for some constructive criticism. Doing these project, and getting sincere comments is the quickest way for me to learn.
Please Login or Register to view this content.
Hi all,
Sorry for not being able to post this message earlier.
I would like to thank all of you, Jindon, D13L, Marc L and Kvsrinivasamurthy, for sharing your knowledge.
D13L,
Unfortunately, I was not able to run the macro in “multiDim_Dup2nd attempt.xlsm” file. An error message appears, saying that “Run time error 1004 - Autofill Method of Range Class Failed”.
Jindon,
Could you please add a variable for last row number in the following line, so that the code looks like:
Change to:Please Login or Register to view this content.
Please Login or Register to view this content.
Kvsrinivasamurthy,
Unfortunately, when I tested your code with the actual data, the results did not match the expected results. Is it possible to correct the code? Please see the attached file “Test File – Kvsrinivasamurthy.xlsb”.
Marc L,
Your code took only 0.17 seconds to run in a worksheet with 65536 rows, but there was an error when testing with the actual data (subscript out of range). I was unsuccessful trying to fix it. Please see the attached file “Test File - Marc L.xlsb”.
Thanks
Last edited by JOAO12; 04-28-2021 at 02:46 PM.
Hi JOAO12,
It seems like it has auto-calculate off which is unable to copy formula down causing the error1004. I've updated the code to add manual calculate.
I've ran to your test sample data and it seems to run at around 18 sec.
I've ran this code on 1 mil random data with decimal points(=RANDBETWEEN(5,25-1)+RAND()) . Ran a little under 200 sec.
Please Login or Register to view this content.
According to this attachment my revised demonstration, result in columns I:J, should need less than a seoond (v2) :
PHP Code:
Sub Demo1r()
Dim T!, V, W, X, L&, R&, N&
T = Timer
V = [A1].CurrentRegion.Value2
W = [E1].CurrentRegion.Value2
ReDim X(1 To Application.Min(UBound(V), UBound(W)), 1)
[I1].CurrentRegion.Clear
L = 1
For R = 1 To UBound(V)
For L = L To UBound(W)
If W(L, 1) > V(R, 1) Then
Exit For
ElseIf W(L, 1) = V(R, 1) Then
If W(L, 2) > V(R, 2) Then
Exit For
ElseIf W(L, 2) = V(R, 2) Then
N = N + 1
X(N, 0) = V(R, 1)
X(N, 1) = V(R, 2)
Exit For
End If
End If
Next L, R
If N Then
[I1:J1].Resize(N).Value2 = X
Debug.Print "Demo1r : "; Format(Timer - T, "0.000s")
End If
End Sub
► Do you like it ? ► ► So thanks to click on bottom left star icon « ★ Add Reputation » ! ◄ ◄
Last edited by Marc L; 04-28-2021 at 07:11 PM. Reason: speed optimization …
According to the same attachment the Dictionary way, slightly slower in more than a second :
PHP Code:
Sub Demo2()
Const D = "¤"
Dim T!, V(1), C%, R&, N&
T = Timer
V(0) = [A1].CurrentRegion.Value2
V(1) = [E1].CurrentRegion.Value2
[I1].CurrentRegion.Clear
C = -(UBound(V(1)) < UBound(V(0)))
With CreateObject("Scripting.Dictionary")
For R = 1 To UBound(V(C)): .Item(V(C)(R, 1) & D & V(C)(R, 2)) = Empty: Next
C = 1 - C
For R = 1 To UBound(V(C))
If .Exists(V(C)(R, 1) & D & V(C)(R, 2)) Then N = N + 1: V(C)(N, 1) = V(C)(R, 1): V(C)(N, 2) = V(C)(R, 2)
Next
.RemoveAll
End With
If N Then
[I1:J1].Resize(N).Value2 = V(C)
Debug.Print "Demo2 : "; Format(Timer - T, "0.000s")
End If
End Sub
► Do you like it ? ► ► So thanks to click on bottom left star icon « ★ Add Reputation » ! ◄ ◄
Last edited by Marc L; 04-28-2021 at 08:22 PM. Reason: add clearing columns I:J
(duplicate)
Last edited by Marc L; 04-28-2021 at 07:35 PM.
Change to
Please Login or Register to view this content.
Hi JOAO12,
I've updated the code to assume no duplicity and removed redundancy in steps. It runs a little over 9 sec with your 100,00 data sample.
Runs to over 38 sec with 1,000,000 random data points.
Note: corrected fill down 'error1004'. Due to auto calculate was off. Added Manual Calculation in code.
EDITED: Updated Calculate to a specific sheet and Cell range only.Please Login or Register to view this content.
Last edited by D13L; 04-29-2021 at 08:23 PM.
Pl see post #19
Jindon, D13L, Marc L and Kvsrinivasamurthy,
Thank you for sharing your knowledge in this thread.
So may I suppose you have choosen my Demo1r ? If yes how many time it requires on your side ?
I have another way faster than a Dictionary for one million rows but not sure faster than Demo1r …
Hi Marc L,
Your macro Demo1r (post #32) takes 0.25 seconds to run.
Once again, thank you for sharing your knowledge in this thread.
0.25 on your sample of 107K (0.37 on my old tests laptop) but not for one million rows ?!
If for one million rows how many matching rows in the result ?
Hi Marc L,
For 1,000,000 rows, with 1,000,000 matching rows, it takes 1.38 second.
Whaou ! Thanks !
I'm a bit surprised of 1 000 000 matching rows so this is the result in columns I:J ?
Yes, it is. Thank you very much.
There are currently 1 users browsing this thread. (0 members and 1 guests)
Bookmarks