I need some help with logic to identify data that has duplicate IDs. First spreadsheet has various data and I need to extract the data that has the same name with the same ID and save it in another spreadsheet. I am having trouble visualizing this loop to extract the duplicate data.
Page 1 / 1
Hi @MarFeli,
You can achieve this using VB Script. Refer below script which requires source excel and destination excel file paths as input. It works for ID and Name cols, where ID placed in first column and Name placed in second column if you want to modify you can do it based on the script comments.
Input with duplicate rows for ID and Name:
VB Script code: Create a list variable in AA and add source file path in index ‘0’ and destination file path in index ‘1’. Pass this list variable as input parameter for script.
Function CopyDuplicateRows(param)
Option Explicit
Dim xlApp, xlBook, xlSheet, xlCopyBook, xlCopySheet, SourceFile, DestFile Dim lastRow, i, j, k Dim dict
SourceFile = param(0) ' your main file path with duplicates
DestFile = param(1) ' Replace with your copy/output file path
' Create an Excel Application object Set xlApp = CreateObject("Excel.Application") xlApp.Visible = False ' Set to True if you want Excel to be visible
' Open the main Excel workbook Set xlBook = xlApp.Workbooks.Open(SourceFile) Set xlSheet = xlBook.Sheets("Sheet1") ' Replace "Sheet1" with your sheet name
' Open the workbook where duplicates will be copied Set xlCopyBook = xlApp.Workbooks.Open(DestFile)
Set xlCopySheet = xlCopyBook.Sheets("Sheet1") ' Replace "Sheet1" with your sheet name in the copy workbook
' Get the last row number with data in column A (assuming Name is in column A and ID is in column in main workbook lastRow = xlSheet.Cells(xlSheet.Rows.Count, "A").End(-4162).Row ' -4162 is xlUp constant in VBA
' Create a dictionary object to store concatenated values Set dict = CreateObject("Scripting.Dictionary")
' Loop through each row in the specified range in main workbook For i = 2 To lastRow ' Assuming data starts from row 2 (adjust if needed) ' Concatenate Name and ID from main workbook Dim concatenated concatenated = xlSheet.Cells(i, 1).Value & "-" & xlSheet.Cells(i, 2).Value ' Adjust column indexes if needed for Name and ID
' Check if concatenated value already exists in dictionary If dict.Exists(concatenated) Then ' Highlight current cell and corresponding row in main workbook xlSheet.Cells(i, 1).Interior.Color = RGB(255, 0, 0) ' Red color xlSheet.Cells(i, 2).Interior.Color = RGB(255, 0, 0) ' Red color xlSheet.Rows(i).Interior.Color = RGB(255, 255, 0) ' Yellow color (for highlighting entire row)
' Copy the entire row to the copy workbook Dim copyLastRow copyLastRow = xlCopySheet.Cells(xlCopySheet.Rows.Count, "A").End(-4162).Row + 1 ' Find the last row in the copy workbook xlSheet.Rows(i).Copy xlCopySheet.Rows(copyLastRow) ' Copy entire row to the copy workbook Else ' Add concatenated value to dictionary dict.Add concatenated, 1 End If Next
' Save and close both workbooks xlBook.Save xlBook.Close xlCopyBook.Save xlCopyBook.Close
' Clean up Set dict = Nothing Set xlSheet = Nothing Set xlBook = Nothing Set xlCopySheet = Nothing Set xlCopyBook = Nothing xlApp.Quit Set xlApp = Nothing
End Function
Output Excel contains below rows:
Please give it a try and let me know if you have any queries.
Thanks,
Hemantha Pindra
While there isn’t an Excel Advanced package that does this for you, there is a Data Table: Remove duplicate rows. That might be useful for a Action only process to do find the non duplicate rows in memory, then use that in a loop to delete the rows that are in the Spreadsheet table.
In the end, the VB script that @HemanthaPindra provided is a similar algorithm, you could also use it as a reference to build that in with Packages and Actions.