Question description:
This user has given permission to use the problem statement for this
blog.
I have a spreadsheet with information on thousands of contacts. Lots of information is mismatched such as phone number in the address column etc.. What is the best way to go about fixing this?
Solved by B. U. in 60 mins
This is the chat thread from the real Excelchat help session. It contains no private user
information.
Excelchat Expert
11/09/2018 - 06:13
Hi
Excelchat Expert
11/09/2018 - 06:14
Vlookup or index and match function could be used for matching them correctly
Excelchat Expert
11/09/2018 - 06:15
If possible please share the file, so that I can help you better
User
11/09/2018 - 06:15
Sure I will share the file.
User
11/09/2018 - 06:15
[Uploaded an Excel file]
Excelchat Expert
11/09/2018 - 06:17
Also, please explain more about the problem
User
11/09/2018 - 06:18
There are lots of errors in this spreadsheet such as duplicate names or addresses in the wrong category
Excelchat Expert
11/09/2018 - 06:19
Could you please give an example
Excelchat Expert
11/09/2018 - 06:21
There?
User
11/09/2018 - 06:21
Penneyo@Cox.Net Douglas Olson Olson Olson Olson Olson Olson 4808370127 4808370127 4808370127 4808370127 4808370127 4808370127 6028038333 Fountain Hills 1 8/18/16 21:32 8/18/16 21:32 10/4/17 12:10 Open House | Local Events Oct 6-8, 2017 04c037213f 124213057 1b3200c3aa
User
11/09/2018 - 06:21
row 4448
User
11/09/2018 - 06:22
row 443
User
11/09/2018 - 06:22
4443
User
11/09/2018 - 06:22
row 4439
User
11/09/2018 - 06:23
there are errors in those rows and similar errors all throughout the worksheet
User
11/09/2018 - 06:24
are you there?
Excelchat Expert
11/09/2018 - 06:24
ye
Excelchat Expert
11/09/2018 - 06:25
*yes
Excelchat Expert
11/09/2018 - 06:25
Was going throught the data
User
11/09/2018 - 06:25
okay
Excelchat Expert
11/09/2018 - 06:27
First of all, I will clean the columns
Excelchat Expert
11/09/2018 - 06:28
if any of the fields match First Name or Last Name, I would replace it with blank
Excelchat Expert
11/09/2018 - 06:28
Also, on more question I have - Did you pull this data from some server?
Excelchat Expert
11/09/2018 - 06:28
and saved it as csv file?
User
11/09/2018 - 06:29
yes I did
Excelchat Expert
11/09/2018 - 06:30
This solution is not excel related but I believe this could be solved while pulling data itself
Excelchat Expert
11/09/2018 - 06:31
If the data in the server contains "," in the fields then while saving it as CSV those values get split into other columns since the delimiter of CSV files are commas
Excelchat Expert
11/09/2018 - 06:31
So, saving them as txt file could solve the problem
Excelchat Expert
11/09/2018 - 06:32
In that way file would have tab as delimiter and not split the fields by commas
Excelchat Expert
11/09/2018 - 06:32
Does this make sense to you?
User
11/09/2018 - 06:34
the original file in the server has these errors aswell
Excelchat Expert
11/09/2018 - 06:34
okay
Excelchat Expert
11/09/2018 - 06:35
While going thorugh the data, I saw that mostly names have spilled into other columns so am replacing those with blanks at first
User
11/09/2018 - 06:36
Okay great, how are you going about doing that?
Excelchat Expert
11/09/2018 - 06:37
=IF(OR(D2=B2,D2=C2),"",D2)
Excelchat Expert
11/09/2018 - 06:37
I am using this formula
Excelchat Expert
11/09/2018 - 06:38
So, what will happen is if the filed matches first name or last name then it will replace it with blank otherwise retain the original value
Excelchat Expert
11/09/2018 - 06:39
Something like this
[Uploaded an Excel file]
User
11/09/2018 - 06:40
okay cool, where are you entering this formula?
Excelchat Expert
11/09/2018 - 06:41
I am creating new columns for each of the fields and entering the formula in the new column
Excelchat Expert
11/09/2018 - 06:41
If you look at the sheet, column E and G have the formula
User
11/09/2018 - 06:43
Okay I see
Excelchat Expert
11/09/2018 - 06:49
Here it is, I did similar thing for rest of the columns
[Uploaded an Excel file]
Excelchat Expert
11/09/2018 - 06:50
Please have a look and let me know if any modifications are required
User
11/09/2018 - 06:53
is there a reason that columns f and g are duplicates?
Excelchat Expert
11/09/2018 - 06:54
First is the original column and the second after treating it with formula
Excelchat Expert
11/09/2018 - 06:55
If you want, I can just keep the treated column and delete the old one
User
11/09/2018 - 06:56
yes that would be great
Excelchat Expert
11/09/2018 - 06:56
okay, working on it
User
11/09/2018 - 06:57
can you also filter out columns B and C and leave it blank if it is just the email repeated?
User
11/09/2018 - 06:57
from column A
Excelchat Expert
11/09/2018 - 06:58
sure
Excelchat Expert
11/09/2018 - 06:58
Basically, same thing apply it on B and C, also
User
11/09/2018 - 06:58
yes exactly
Excelchat Expert
11/09/2018 - 06:58
okay
Excelchat Expert
11/09/2018 - 07:02
Here it is
[Uploaded an Excel file]
User
11/09/2018 - 07:05
this one still has the email from column A repeated into columns B and C
Excelchat Expert
11/09/2018 - 07:05
okay, i might have attached the wrong file
Excelchat Expert
11/09/2018 - 07:05
Let me check
User
11/09/2018 - 07:06
okay
Excelchat Expert
11/09/2018 - 07:06
Also, which row is it happening, i will re-check it
User
11/09/2018 - 07:08
row 313 is the first place I see it
Excelchat Expert
11/09/2018 - 07:08
okay
User
11/09/2018 - 07:09
also in row 1415 it is putting (Last Name, First name) in column B instead of First name in column B and last name in column C
Excelchat Expert
11/09/2018 - 07:10
[Uploaded an Excel file]
Excelchat Expert
11/09/2018 - 07:10
Okay, for this I would have to perform other type cleaning
User
11/09/2018 - 07:11
Okay how does that work?
Excelchat Expert
11/09/2018 - 07:11
If it finds comma in the name then take instance after the comma in the first name and before comma in the last name
User
11/09/2018 - 07:11
okay, what is the formula for this?
Excelchat Expert
11/09/2018 - 07:12
It will use find function
Excelchat Expert
11/09/2018 - 07:13
where it will find comma then use left function for the last name and combination of right and len functions for the first name
This is the output file from the real Excelchat help session:
This is an example of the expert help you can get. It contains no private user
information.