Question description:
This user has given permission to use the problem statement for this
blog.
Hi I'm trying to use fuzzy lookup.
Not sure how it's working.
I have two strings which are different ('Unemployment Benefits' vs 'Invalid pensioner'), yet getting a similarity index of 0.9474.
Solved by E. E. in 40 mins
This is the chat thread from the real Excelchat help session. It contains no private user
information.
Excelchat Expert
26/03/2018 - 12:51
HI, hope you are doing well today!
Excelchat Expert
26/03/2018 - 12:51
Welcome to Got it Pro!
User
26/03/2018 - 12:51
Hi, thanks!
Excelchat Expert
26/03/2018 - 12:51
Could you let me know how you want the string to be searched?\
User
26/03/2018 - 12:52
I basically have two columns. They are supposed to be the same (but are text strings and have some discrepancies e.g. Other (Manager) vs Other (Managers)
User
26/03/2018 - 12:52
at
User
26/03/2018 - 12:52
Things like that
User
26/03/2018 - 12:52
So just want to do an approximate match
Excelchat Expert
26/03/2018 - 12:52
Ok.
Excelchat Expert
26/03/2018 - 12:53
Could you provide some exmaple here?
User
26/03/2018 - 12:53
But not sure how I should adjsut the settings
User
26/03/2018 - 12:53
Ok
User
26/03/2018 - 12:53
e.g.
User
26/03/2018 - 12:53
Column 1 'Other (Professionals)' vs columns 2 'OTHER (PROFESSIONAL)' gives me a similarity index of 0.9843
Excelchat Expert
26/03/2018 - 12:54
not sure what do you mean by index 0.9843
User
26/03/2018 - 12:54
While col1 'Unemployment Benefits' vs col2 'INVALID PENSIONER' gives me similarity 0.9474
User
26/03/2018 - 12:54
Similarity as in the output of the fuzzy lookup
Excelchat Expert
26/03/2018 - 12:54
But the vlookup function should give you the approxiate match criteia
User
26/03/2018 - 12:55
I'm using fuzzy lookup
User
26/03/2018 - 12:55
Is there a way to do approximate matching using just vlookup??
Excelchat Expert
26/03/2018 - 12:55
yes.
User
26/03/2018 - 12:55
Ohh could you please tell me how?
User
26/03/2018 - 12:55
Syntax?
User
26/03/2018 - 12:56
Or a link to some resource online?
Excelchat Expert
26/03/2018 - 12:56
VLOOKUP( value, table, index_number, [approximate_match] )
User
26/03/2018 - 12:56
I thought vlookup only gives exact matches
Excelchat Expert
26/03/2018 - 12:56
if you look at 4th parameter, it is showing approciate match
Excelchat Expert
26/03/2018 - 12:57
usually you will choose either 0 or 1 for the 4th parameter, which represent if it is approximate or exactly match
User
26/03/2018 - 12:58
Yes, thought that can't work in this case (from what i googled, it said I need to use fuzzy)
User
26/03/2018 - 12:58
col1 col2
Unemployment Benefits INVALID PENSIONER
User
26/03/2018 - 12:58
SO for the above
User
26/03/2018 - 12:58
If I want to do an if statement
User
26/03/2018 - 12:59
Where if the columns are similar, return a value of 1, else 0
User
26/03/2018 - 12:59
How do I do that?
User
26/03/2018 - 12:59
col1 col2
Unemployment Benefits INVALID PENSIONER
Other (Professionals) OTHER (PROFESSIONAL)
Excelchat Expert
26/03/2018 - 12:59
iferror(vlookup function here, "1","0")
Excelchat Expert
26/03/2018 - 01:00
iferror give you the flexibility to define what value you want to show if nothing show up
User
26/03/2018 - 01:01
But I want it to match with the value in column 1 from the same row
User
26/03/2018 - 01:01
Not search the entire table
Excelchat Expert
26/03/2018 - 01:02
then, you might need to use If statement rather than vlookup
User
26/03/2018 - 01:02
So you can use if statement for approximate match?
User
26/03/2018 - 01:02
is there a function for that?
Excelchat Expert
26/03/2018 - 01:03
yes
Excelchat Expert
26/03/2018 - 01:04
If(certain cell= "*this is the string*",0,1)
User
26/03/2018 - 01:04
but thats an exact match right
User
26/03/2018 - 01:04
not approx
Excelchat Expert
26/03/2018 - 01:05
no
Excelchat Expert
26/03/2018 - 01:05
that is wildcard search
Excelchat Expert
26/03/2018 - 01:05
since we include " * " sign inside
User
26/03/2018 - 01:05
Oh you have the *
User
26/03/2018 - 01:05
sorry didn't notice
User
26/03/2018 - 01:05
So you're saying where you have "*this is the string"
Excelchat Expert
26/03/2018 - 01:05
Yep. That will do the work for you
User
26/03/2018 - 01:06
can substitute col1 in there?
User
26/03/2018 - 01:06
So I don't actually have to type out the string each time
User
26/03/2018 - 01:06
But add cell reference?
Excelchat Expert
26/03/2018 - 01:06
Yes
Excelchat Expert
26/03/2018 - 01:06
That should work as well
Excelchat Expert
26/03/2018 - 01:07
Please feel free to let me know if you have any other questions. Otherwise, you can end the session at anytime :-)
User
26/03/2018 - 01:07
haha that didn't work
User
26/03/2018 - 01:08
=IF(AD2="*S2*",0,1)
Excelchat Expert
26/03/2018 - 01:08
the string section will need you to actually type in the portion you want to search for
User
26/03/2018 - 01:08
That's what I used
User
26/03/2018 - 01:08
Oh so I can't use a cell reference
User
26/03/2018 - 01:08
That's why I was using fuzzy lookup
Excelchat Expert
26/03/2018 - 01:08
All the character inside "" means the actual string you are looking for
User
26/03/2018 - 01:08
Because I can't manually type in the string for 50,000 records
User
26/03/2018 - 01:09
I want it to match with the other column
User
26/03/2018 - 01:09
Which will be different for each observation.
User
26/03/2018 - 01:09
TOo many records to manually enter strings in each row
User
26/03/2018 - 01:09
So could we go back to fuzzy lookup then?
Excelchat Expert
26/03/2018 - 01:13
I'm not sure where I can find the fuzzy lookup function in excel, but the if statement should just do the work if you are only search for one string that on the same row
User
26/03/2018 - 01:13
fuzzy is an add on
User
26/03/2018 - 01:13
o if doesnt work
User
26/03/2018 - 01:13
*no
User
26/03/2018 - 01:14
i dont want to manually type in a string...
User
26/03/2018 - 01:15
col 1 Manager
col2 Managers
User
26/03/2018 - 01:15
i want a third column to tell me how similar they are
User
26/03/2018 - 01:15
for about 30,000 rows
User
26/03/2018 - 01:16
each row with different strings being compared
Excelchat Expert
26/03/2018 - 01:16
=if(A1=left(B1,3),1,0)
User
26/03/2018 - 01:16
but thats not an approx match
User
26/03/2018 - 01:17
For my above example where the second column has managers with an S
User
26/03/2018 - 01:17
It will say the two columns don't match
User
26/03/2018 - 01:17
But they are very similar
User
26/03/2018 - 01:17
So I want it to show they do match approximately
User
26/03/2018 - 01:17
That way I can see the true mismatches
Excelchat Expert
26/03/2018 - 01:17
I dont think there is anyway to do this unless you define very strict what you are comparing with
User
26/03/2018 - 01:18
Yes
User
26/03/2018 - 01:18
That's why fuzzy lookup is used
User
26/03/2018 - 01:18
And I just wanted some help understanding it
User
26/03/2018 - 01:18
But if you're not aware of it then that's ok
User
26/03/2018 - 01:18
Thanks for your time
Excelchat Expert
26/03/2018 - 01:19
You think you can definitely search for some online content and some material should be avialable then
Excelchat Expert
26/03/2018 - 01:19
Thanks
This is the output file from the real Excelchat help session:
This is an example of the expert help you can get. It contains no private user
information.