Question

**Please write the code as a function definition that takes the inputs as parameters and uses the given variable names**, e.g. List matchNames(List inputNames, List secRecords)

Java or Python

Please Note:

   * The function is expected to return a STRING_ARRAY.
   * The function accepts the following parameters:
       1. STRING_ARRAY inputNames
       2. STRING_ARRAY secRecords

Problem Statement

Introduction

Imagine you are helping the Securities and Exchange Commission (SEC) respond to anonymous tips. One of the biggest problems the team faces is handling the transcription of the companies reported by the callers. You've noticed that sometimes the company name is misheard by the person taking the call, sometimes it is simply mistyped, and sometimes both. These problems make it more difficult to search the SEC records to identify the company.

You have access to the list of transcribed company names and the database of SEC records. We need a way to effectively translate company names based on their transcriptions so we can narrow our search results to the one company we are interested in.

Input

You will receive a string array representing the list of transcribed company names.

Each string in the array takes the following form:

  • The string will contain a company name comprising a set of words separated by spaces.
  • You can assume company names will only use alphabetic characters -- no numbers, no punctuation.

You will also receive a string array representing the database of SEC records.

  • Each string will be of the form <Company Name>;<Company EIN> (company name string and company EIN string separated by a semicolon). An EIN is a federal tax identification number used to uniquely identify each company.
  • As above, the company name half of the string will contain a set of words separated by spaces.
  • The EIN half of the string comprises 2 digits, a dash, and 7 more digits, in that order. For example, "12-3456789".
  • There will be no semicolons anywhere in the company name or EIN strings.
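
As a minimal sketch of the record format described above, the following splits one SEC record into its name and EIN and checks the EIN shape. The helper name `parse_record` and the constant `EIN_PATTERN` are made up for illustration and are not part of the problem statement.

```python
import re

# Pattern for the EIN format described above: 2 digits, a dash, 7 digits.
# (EIN_PATTERN and parse_record are hypothetical names for this sketch.)
EIN_PATTERN = re.compile(r"\d{2}-\d{7}")

def parse_record(record):
    """Split one "<Company Name>;<Company EIN>" string into (name, ein)."""
    # The problem guarantees exactly one semicolon per record.
    name, ein = record.split(";")
    if not EIN_PATTERN.fullmatch(ein):
        raise ValueError(f"unexpected EIN format: {ein!r}")
    return name, ein

print(parse_record("Pear Computers;54-1264938"))
```

The validation step is optional, since the problem promises well-formed records, but it can help when building custom inputs by hand.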

You may also make the following assumptions about the structure:

  • There will be at most 1000 companies in the SEC database.
  • There will be at most 50 company names in the input.
  • No company names or EINs will be repeated in either the input or the database.

Output

For each transcribed company name in the input string array, you want to match that to a company name (first part of a string) in the SEC database. The second part of the string in the SEC database will represent the company's EIN. Your output should also be a string array, this time representing the EINs mapped to the names in the input string array. You may assume that every input name will match a name in the SEC records.

Responding to Calls

The Basics

Let's start with the first step: making sure that if the name is transcribed perfectly, we match that company's record in the database right away. This will give you an idea of how to match company names in our system and what the output array should be. This will also show you how the input is structured if you want to make your own custom inputs. The input comes in the form of two string arrays, each preceded by a line giving the length of the array. An example is below.

Input

3

Pear Computers

Construct An Ursus

Planetary Technologies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]

Your code should pass test cases 0, 1, and 2 after solving this step.
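
One possible way to handle this first step is a plain dictionary lookup from company name to EIN. This is only a sketch: the function name `match_exact` is hypothetical, and it assumes every input name appears verbatim in the records.

```python
def match_exact(inputNames, secRecords):
    """Return the EIN for each input name, assuming perfect transcription."""
    ein_by_name = {}
    for record in secRecords:
        # Each record has the form "<Company Name>;<Company EIN>".
        name, ein = record.split(";")
        ein_by_name[name] = ein
    # A KeyError here would mean the name was not transcribed perfectly.
    return [ein_by_name[name] for name in inputNames]

inputNames = ["Pear Computers", "Construct An Ursus", "Planetary Technologies"]
secRecords = ["Pear Computers;54-1264938",
              "Construct An Ursus;58-1481332",
              "Planetary Technologies;19-3563561"]
print(match_exact(inputNames, secRecords))
```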

Misspellings

The second thing we want to look for is basic misspellings, where the transcriber hears the company name correctly but misses a keystroke or presses the wrong key. Think "Harveys Steakhouse" turns into "Harfeys Sreakhouse" or "Sugar and Sugar" turns into "Sugra and Sugar". In the first example, the transcriber missed the "v" key and hit "f" instead, and missed "t" and hit "r" instead. In the second, the transcriber accidentally typed "r" before "a". You should pass test cases 3 through 8 after solving this problem. Hint: looking up the phrase "string edit distance" in a search engine should be of some help to you here.

Input

3

Pewar Computers

Consuct A Ursuus

Planteray Techniligies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]
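
Following the "string edit distance" hint, here is a rough sketch that uses the classic Levenshtein distance and picks the record whose name is closest to each input. The names `edit_distance` and `match_by_edit_distance` are hypothetical, and other distance measures may work just as well.

```python
def edit_distance(a, b):
    """Levenshtein distance: fewest insertions, deletions, substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # delete ch_a
                               current[j - 1] + 1,       # insert ch_b
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

def match_by_edit_distance(inputNames, secRecords):
    """Map each input name to the EIN of the closest SEC company name."""
    records = [record.split(";") for record in secRecords]
    output = []
    for name in inputNames:
        best_name, best_ein = min(records, key=lambda r: edit_distance(name, r[0]))
        output.append(best_ein)
    return output

inputNames = ["Pewar Computers", "Consuct A Ursuus", "Planteray Techniligies"]
secRecords = ["Pear Computers;54-1264938",
              "Construct An Ursus;58-1481332",
              "Planetary Technologies;19-3563561"]
print(match_by_edit_distance(inputNames, secRecords))
```

Note that plain Levenshtein distance counts a swapped pair such as "Sugra" for "Sugar" as two edits rather than one. With at most 50 inputs and 1000 records, comparing every pair stays cheap.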

Metaphones

The last and trickiest instance of transcription comes in the form of arbitrary misspellings resulting from the transcriber either hearing the name correctly and using a different spelling than the one in our database, or mishearing the name in some form. Think "Ashley Antiques" vs. "Ashlee Antiques" vs. "Ashleigh Antiques" or "Rate My Reading" turns into "Great My Treating". This is a purposefully very open-ended and tricky problem, and you are not expected to get all cases. One example is viewable and most are purposefully hidden - try to be creative with your solution, as there are multiple ways you could solve this piece! Test cases 9 through 16 are the ones that relate to this part of the problem; as before, an example is below.

Input

3

Pare Computers

Conduct An Ersis

Palintary Technawlogies

3

Pear Computers;54-1264938

Construct An Ursus;58-1481332

Planetary Technologies;19-3563561

Output

["54-1264938", "58-1481332", "19-3563561"]
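
Since this part is open-ended, here is one possible approach, sketched under heavy assumptions. It is not the real Metaphone algorithm: it reduces each word to a crude "consonant skeleton" (dropping vowels plus h and w, and collapsing repeated letters), then uses difflib.SequenceMatcher to pick the record whose key is most similar. The names `phonetic_key` and `match_by_sound`, and the choice of letters to drop, are invented for illustration.

```python
import difflib

# Letters dropped by this crude key; the choice is an assumption, not Metaphone.
DROPPED = set("aeiouyhw")

def phonetic_key(name):
    """Reduce a name to a rough consonant skeleton, word by word."""
    words = []
    for word in name.lower().split():
        skeleton = []
        for ch in word:
            if ch in DROPPED:
                continue
            if skeleton and skeleton[-1] == ch:
                continue  # collapse repeated consonants
            skeleton.append(ch)
        words.append("".join(skeleton))
    return " ".join(words)

def match_by_sound(inputNames, secRecords):
    """Map each input name to the EIN whose name has the most similar key."""
    records = []
    for record in secRecords:
        name, ein = record.split(";")
        records.append((phonetic_key(name), ein))
    output = []
    for name in inputNames:
        key = phonetic_key(name)
        best_key, best_ein = max(
            records,
            key=lambda r: difflib.SequenceMatcher(None, key, r[0]).ratio())
        output.append(best_ein)
    return output

inputNames = ["Pare Computers", "Conduct An Ersis", "Palintary Technawlogies"]
secRecords = ["Pear Computers;54-1264938",
              "Construct An Ursus;58-1481332",
              "Planetary Technologies;19-3563561"]
print(match_by_sound(inputNames, secRecords))
```

On the visible example, "Pare" and "Pear" both reduce to "pr", so the match is exact. A skeleton like this will not handle every case: "Ashley" and "Ashlee" share a key, but "Ashleigh" keeps an extra "g".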

Homework Answers

Answer #1
import difflib

def matchNames(inputNames, secRecords):
    my_dict = {}
    possibilities = []
    output = []
    for val in secRecords:
        # Split each record "<Company Name>;<Company EIN>" into its two parts
        data = val.split(";")
        my_dict[data[0]] = data[1]
        possibilities.append(data[0])
    for name in inputNames:
        # n=1 keeps only the best match; cutoff=0.0 ensures a match is
        # returned even when similarity is below the default 0.6 threshold
        best = difflib.get_close_matches(name, possibilities, n=1, cutoff=0.0)[0]
        output.append(my_dict[best])
    # The function is expected to return the EINs, not print them
    return output


# Quick check that the function works
inputNames = ["Pare Computers","Conduct An Ersis","Palintary Technawlogies"]
secRecords = ["Pear Computers;54-1264938",
              "Construct An Ursus;58-1481332",
              "Planetary Technologies;19-3563561"]
print(matchNames(inputNames, secRecords))

I used the difflib module from the Python standard library to do the matching.

difflib.get_close_matches(name, possibilities) takes a name and a list of candidate strings and returns a list of the closest candidates, most similar first.
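
One caveat worth illustrating: get_close_matches only returns candidates whose similarity is at least its cutoff parameter, which defaults to 0.6, so a badly garbled name can produce an empty list. The sketch below shows that behavior; the input "Zzz" is a made-up example.

```python
import difflib

candidates = ["Pear Computers", "Construct An Ursus", "Planetary Technologies"]

# With the default cutoff of 0.6, a poor transcription can yield no matches.
print(difflib.get_close_matches("Zzz", candidates))

# With n=1 and cutoff=0.0, the single most similar candidate is always returned.
print(difflib.get_close_matches("Pare Computers", candidates, n=1, cutoff=0.0))
```

Since the problem guarantees that every input name matches some record, lowering the cutoff is a reasonable way to avoid indexing into an empty list.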
