Sovren AI Matching/Searching

The Sovren AI Matching Engine (AIM) provides the ability to index, search, and perform high-quality profile-based matching of resumes and job descriptions. Our cloud-based matching platform provides a scalable solution to finding a needle in the haystack without the need for countless hours of reviewing resumes/jobs.

Indexing

In order to be able to search and match documents you must first add them to an index. Indexes are collections of documents (either jobs or resumes) and can be used as logical separations for your documents. You can search/match over one or more indexes in the same request.

Populating an Index

In the Sovren AI Matching Engine, an index is a collection of documents of the same type, either Resumes or Jobs. You create one or more indexes, you add/update/delete documents in those indexes, and you search/match within those indexes. Each index is an inverted full-text index that provides fast searching of terms within the text of the documents as well as the semantic data, without needing to scan individual files.

It's recommended to separate indexes into logical groups based upon typical searching/matching traffic. For example, if your company has multiple departments and generally doesn't recruit across departments, it's a good idea to give each department its own index (this is also appropriate for multi-tenant scenarios). This makes searching/matching more performant because the engine does less work to identify the result set (this can also be achieved using custom value ids). When data is split across multiple indexes you can still search/match over multiple indexes in a single transaction. If your company's recruiting needs are more general, it makes the most sense to simply have a single index for resumes and a single index for jobs.

Our engine is built on a near real-time full-text search engine, so typically within 1 second of adding a document to an index it will be searchable from the API. We recommend that you build the indexing of new documents directly into your application's workflow. For example, if you need to geocode location coordinates, the workflow would be: 1) Parse, 2) Geocode, 3) Index. In scenarios where you have more than one document to add to an index, make sure to use the bulk index API endpoint for the most performant response.

Matching

Matching allows you to simply supply a Job, Resume, or some Criteria and tell the engine to find the best matching documents.

Ideally, you’d have so many great matching resumes or jobs that those are all you’d see, but anyone that’s done recruiting knows that there are at best a small number of great matches followed by a huge number of partial or weak matches. In some cases, depending on the actual contents of your index, the best match may be a weak match. The job of the AI Matching Engine is to show you the best matches, ranked best to worst by absolute score, so that your users can spend their time more efficiently than wading through huge volumes of bad matches.

Since our matching engine excels at sorting and scoring large document sets, it's not recommended to ever return more than 100 documents from a transaction. This is more results than a human has time to look through, and by going that deep in the data set the results will be significantly worse. If you need to narrow the focus of a transaction use the filtering layer to restrict the document set to a smaller subset of the index.

Benefits:

  • End users do not need to fill out complicated search forms; they can simply select a Job/Resume profile from your system or upload a Job/Resume document.
  • This is the easiest of all types of searches to implement, and the most powerful.
  • Results are ranked and scored from perfect to partial matches, taking into account many domain-specific factors in what makes a good match, unlike SQL queries which cannot provide relevancy scores or Full Text queries that don’t understand the context.

There are three endpoints you can use to perform a match.

Searching

Searching leverages the same filtering layer as matching, but acts only as a boolean query. It returns the result set sorted with the newest results first. This endpoint exposes pagination to allow you to explore the first 1000 documents in 100-result pages. If you need to go deeper than 1000 results you should restructure your query and add a more restrictive filter.

For more details on how to make a search API call, refer to the API Documentation.

Scores

Scores in matching are absolute whole numbers ranging from a minimum of 1 to a maximum of 100. A score of 100 is a perfect match and a score of 1 means almost nothing matched. Our engine doesn't use flawed density calculations or other common full-text scoring algorithms. We developed our own scoring calculation that evaluates two documents as a human would. Our matching returns those scores broken down by category with suggested weights for each category. This allows us to give suggested scores without removing users' ability to influence the calculation.
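To make the idea of per-category scores and adjustable weights concrete, here is a minimal sketch. The weighted-average formula, the function name, and the dictionary shapes are assumptions made for illustration only; the engine's actual calculation is not described here. Only the category names are taken from this page.

```python
def weighted_score(category_scores, weights):
    """Combine per-category scores (1-100) using relative weights.

    Assumption: a plain weighted average, rounded to a whole number.
    This is an illustration, not the engine's documented formula.
    """
    total_weight = sum(weights.get(name, 0.0) for name in category_scores)
    if total_weight == 0:
        raise ValueError("at least one category needs a non-zero weight")
    weighted_sum = sum(score * weights.get(name, 0.0)
                       for name, score in category_scores.items())
    return round(weighted_sum / total_weight)


# Hypothetical category scores and suggested weights for one match result.
scores = {"Skills": 80, "Job Titles": 60, "Education": 100}
suggested = {"Skills": 0.5, "Job Titles": 0.3, "Education": 0.2}

# A user who cares more about job titles could supply different weights.
custom = {"Skills": 0.2, "Job Titles": 0.7, "Education": 0.1}

print(weighted_score(scores, suggested))  # 78
print(weighted_score(scores, custom))     # 68
```

The same category scores produce a different overall score once the weights change, which is how users can keep influencing the result.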

Categories

Category          Explanation
Certifications    List of certifications required or attained.
Education         Highest degree level required or attained.
Executive Type    If an executive, then the type of executive (such as Executive, Operations, Financial, IT).
Job Titles        Exact and partial match position titles.
Languages         Foreign languages required or attained.
Management Level  The level of management required or attained, from low-level leadership and supervisory positions, to mid-level managers and directors, to high-level VPs, to C-level executives.
Skills            List of skills required or attained.
Industries        Best Fit Taxonomy (industry) as calculated by an analysis of skills to fit a resume or job into categories such as "Information Technology → Programming" or "Finance → Accounting".

Custom Values

The Sovren AI Matching Engine supports adding custom values that can be associated with resumes and jobs and that can be used for filtering the result set. Custom values are implemented by encoding values as strings and passing those into the index document api call. The recommended string format for a custom Id value is: Prefix + Field + Value:

  • Prefix is a unique prefix that distinguishes a value from any other source or custom field. For example, "CFZ".
  • Field is some string to identify the field or property. For example, "Status".
  • Value is the value of the field encoded as a string. For example, "Active" or "123".
  • The entire Id should be alphanumeric without any whitespace, symbols or punctuation (i.e. "abc123" instead of "abc-123") so that there will not be any issues with how these values are tokenized by the indexing engine.

Combining these strings together results in custom Id values like:

  • CFZHot
  • CFZStatusActive
  • CFZRegion123
  • CFZAvailable20150116

Once documents with those values are indexed, you can filter the data set by those custom ids by setting the FilterCriteria.CustomValueIds property.

{  
  "FilterCriteria": {
    "CustomValueIds": ["CFZHot", "CFZStatusActive", "CFZRegion123", "CFZAvailable20150116"]
  }
}
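The Prefix + Field + Value convention can be sketched as a small helper that builds an Id and rejects anything that is not alphanumeric. The function name is hypothetical, and restricting the check to ASCII characters is an assumption on top of the "alphanumeric" rule stated above.

```python
import json


def make_custom_value_id(prefix, field, value):
    """Build a custom value Id as Prefix + Field + Value.

    Raises ValueError when the result is not purely alphanumeric, because
    whitespace, symbols, and punctuation may be tokenized unexpectedly.
    Assumption: only ASCII letters and digits are accepted.
    """
    custom_id = f"{prefix}{field}{value}"
    if not (custom_id.isascii() and custom_id.isalnum()):
        raise ValueError(f"custom value Id must be alphanumeric: {custom_id!r}")
    return custom_id


ids = [
    make_custom_value_id("CFZ", "Status", "Active"),
    make_custom_value_id("CFZ", "Region", 123),
    make_custom_value_id("CFZ", "Available", "20150116"),
]

# The same list can then be used as the CustomValueIds filter.
filter_criteria = {"FilterCriteria": {"CustomValueIds": ids}}
print(json.dumps(filter_criteria, indent=2))
```

Validating Ids when they are created keeps values such as "abc-123" from reaching the index in the first place.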

Filtering

The engine supports limiting result sets to a range of records specified by a filter. It is important to understand that filters work like this: (query) AND (filter). In order for the search to return any records, the query and filter portions must both match some of the same records – in other words, the two result sets are intersected. If you expect results but are not getting any, then run your query without the filter to see if it is returning any records, and also check to see if any of those results will also be matched by your filter.

The query portion is all of your normal search criteria. The query portion is scored but the filter portion is not. So, if you have criteria that need to be part of your search but that should not affect the score, then move them to the filter portion. The filter portion is high-performance and limits the result set without affecting the score. Because the filter performs better than the query, you should consider moving as much of your search criteria to the filter as possible. Filters can be specified on both search and match API calls.
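The (query) AND (filter) behavior can be pictured as a set intersection in which only the query contributes a score. The sketch below uses made-up document ids and scores and is not how the engine is implemented internally.

```python
# Hypothetical results of each portion of a request.
query_hits = {"r1": 92, "r2": 75, "r3": 40}   # document id -> score from the query
filter_hits = {"r2", "r3", "r9"}              # document ids that pass the filter

# (query) AND (filter): only ids present in both survive, and the score
# comes from the query portion alone.
results = {doc_id: score for doc_id, score in query_hits.items()
           if doc_id in filter_hits}
ranked = sorted(results.items(), key=lambda item: item[1], reverse=True)
print(ranked)  # [('r2', 75), ('r3', 40)]

# If the two sets share no ids, the request returns nothing even though
# the query and the filter each match documents on their own.
print(set(query_hits) & {"r8", "r9"})  # set()
```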

Semantic Filter

Filter a result set using an object-based representation of your query. These are the easiest to build, but don't have the same level of boolean flexibility as a full-text filter. All of the property queries are joined together by AND, but the terms in each property are joined together by OR. Some examples are provided below, but for a full explanation of the FilterCriteria object and all of the available properties refer to the Search and Match endpoints. For example, to filter the result set to documents with (document id 1 OR document id 2) AND employer Google AND (skill java OR skill c#) use:

{  
  "FilterCriteria": {
    "DocumentId": [ "1", "2" ],
    "Employers": ["Google"],
    "Skills": [
      {
        "SkillName": "java"
      },
      {
        "SkillName": "c#"
      }
    ]
  }
}
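The AND-across-properties, OR-within-a-property rule can be sketched against a few in-memory documents. The flat dictionary shape, the function name, and the case-insensitive comparison are simplifying assumptions for illustration; the real FilterCriteria object has more structure than this.

```python
def passes_filter(document, criteria):
    """Return True when the document satisfies every property in criteria.

    Properties are combined with AND; the terms listed for a single
    property are combined with OR. Comparison is case-insensitive here,
    which is an assumption of this sketch.
    """
    for prop, accepted_terms in criteria.items():
        values = {value.lower() for value in document.get(prop, [])}
        if not any(term.lower() in values for term in accepted_terms):
            return False
    return True


criteria = {"Employers": ["Google"], "Skills": ["java", "c#"]}
docs = [
    {"Id": "1", "Employers": ["Google"], "Skills": ["Java", "SQL"]},
    {"Id": "2", "Employers": ["Google"], "Skills": ["Python"]},
    {"Id": "3", "Employers": ["Acme"], "Skills": ["C#"]},
]

matching = [doc["Id"] for doc in docs if passes_filter(doc, criteria)]
print(matching)  # ['1']
```

Document 2 fails the Skills property and document 3 fails the Employers property, so only document 1 satisfies both.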

Location Filtering

Our engine allows you to filter searches and matches based on an exact location, or a radius from that location. For example, we can filter to an exact match to Dallas, Texas, or say 25 miles from Dallas, Texas. When using a distance, our API will call out to Google to get the geo coordinates of the address you specify. Just like the Geocode endpoint, you can specify your own Google or Bing account, or you can pass the latitude and longitude directly into the search or match call. Geocoding in a search or match follows the same cost structure as the Geocode endpoint and is documented here.

Filter an exact address

{  
  "FilterCriteria": {
    "LocationCriteria": {
      "Locations": [
        {
          "CountryCode": "US",
          "Region": "Texas",
          "Municipality": "Dallas"
        }
      ]
    }
  }
}

Filter 25 miles from an exact address using the built in Google account

{  
  "FilterCriteria": {
    "LocationCriteria": {
      "Locations": [
        {
          "CountryCode": "US",
          "Region": "Texas",
          "Municipality": "Dallas"
        }
      ],
      "Distance": 25,
      "DistanceUnit": "Miles"
    }
  }
}

Filter 25 miles from an exact address using predefined latitude and longitude

{  
  "FilterCriteria": {
    "LocationCriteria": {
      "Locations": [
        {
          "GeoPoint": {
            "Latitude": 32.780115,
            "Longitude": -96.7851757
          }
        }
      ],
      "Distance": 25,
      "DistanceUnit": "Miles"
    }
  }
}
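For intuition about what a radius filter does once coordinates are known, the sketch below checks whether a point lies within a given number of miles of a center using the haversine formula. The formula choice, the Earth radius constant, and the function names are assumptions; how the engine measures distance is not stated here. The Dallas coordinates are the ones from the example above, while the Plano and Houston coordinates are approximate.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(center, point, miles):
    """True when point is no farther than `miles` from center."""
    return distance_miles(*center, *point) <= miles


dallas = (32.780115, -96.7851757)
plano = (33.0198, -96.6989)     # approximate coordinates
houston = (29.7604, -95.3698)   # approximate coordinates

print(within_radius(dallas, plano, 25))    # True
print(within_radius(dallas, houston, 25))  # False
```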

Full-text Filter

Filter a result set using a custom query expression. For example, to filter the result set to documents that contain the word Elasticsearch and have the skill c#, use:

{  
  "FilterCriteria": {
    "SearchExpression": "skill:(c#) AND Elasticsearch"
  }
}

This is just a simple example of what full-text filtering can accomplish. This field supports standard full-text searches, semantic searches, or a combination of both types of searches.

Boolean Syntax

A Boolean search request consists of a group of words or phrases linked by connectors such as AND, OR, NOT that indicate the relationship between them. Note: The connector words must be uppercase.

Expression          Explanation
apple AND pear      Both words must be present
apple OR pear       Either word can be present
apple AND NOT pear  apple must be present and pear must not be present

If you use more than one connector, you should use parentheses to indicate precisely what you want to search for. For example, apple AND pear OR orange could mean (apple AND pear) OR orange, or it could mean apple AND (pear OR orange).
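The two readings really do return different documents. The sketch below models AND as set intersection and OR as set union over a few made-up documents; it illustrates the grouping only and says nothing about how the engine parses expressions.

```python
# Hypothetical documents and the words they contain.
docs = {
    "d1": {"apple", "pear"},
    "d2": {"orange"},
    "d3": {"apple", "orange"},
    "d4": {"pear"},
}


def has(term):
    """Ids of the documents that contain the term."""
    return {doc_id for doc_id, words in docs.items() if term in words}


# (apple AND pear) OR orange
left_grouped = (has("apple") & has("pear")) | has("orange")
# apple AND (pear OR orange)
right_grouped = has("apple") & (has("pear") | has("orange"))

print(sorted(left_grouped))   # ['d1', 'd2', 'd3']
print(sorted(right_grouped))  # ['d1', 'd3']
```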

Search terms may include the following special characters:

Character  Explanation
( )        Parentheses for precedence and grouping.
?          Matches any single character. Example: appl? matches apply or apple.
*          Matches any number of characters. Example: appl* matches application.
~          Fuzzy search. Example: ba~nana matches banana, bananna.
~~         Range query. Example: 12~~24 matches 18.
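The ? and * wildcards behave much like shell-style patterns, so Python's fnmatch module can serve as a stand-in for experimenting with them. This is only an analogy: details such as whether * may match zero characters in the engine are assumptions of this sketch.

```python
from fnmatch import fnmatchcase

words = ["apple", "apply", "application", "apples"]

# "?" stands for exactly one character.
single = [word for word in words if fnmatchcase(word, "appl?")]
# "*" stands for any number of characters.
multi = [word for word in words if fnmatchcase(word, "appl*")]

print(single)  # ['apple', 'apply']
print(multi)   # ['apple', 'apply', 'application', 'apples']
```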

Semantic Expressions

Search expressions support Semantic Clauses, which are Sovren extensions to the underlying search engine syntax that can be placed anywhere within the Boolean expression. Semantic Clauses take the following form: type:(term; parameter1=value; parameter2=value; ...)

Each parameter is separated by semicolons. If a term or parameter value contains an equal sign, semicolon or parentheses, then it must be surrounded by double-quotes or those characters must be escaped by a backslash character. When inside double-quotes, only double-quote characters must be escaped by a backslash. If the values come from user input, then make sure that you escape those values to prevent search syntax errors or unexpected results.
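Because user input can contain the reserved characters, it may help to build clauses through a helper rather than by hand. The sketch below follows the quoting rules in the paragraph above. The function names are hypothetical, and two behaviors are assumptions: a value that contains a double-quote is also wrapped in quotes, and backslashes already present in a value are left untouched.

```python
RESERVED = set('=;()"')


def quote_value(value):
    """Wrap a term or parameter value in double-quotes when it needs it.

    Inside the quotes, only double-quote characters are escaped with a
    backslash.
    """
    text = str(value)
    if not any(ch in RESERVED for ch in text):
        return text
    return '"' + text.replace('"', '\\"') + '"'


def semantic_clause(clause_type, term, **parameters):
    """Build type:(term; parameter1=value; ...) without the extra spaces."""
    parts = [quote_value(term)]
    parts += [f"{name}={quote_value(val)}" for name, val in parameters.items()]
    return f"{clause_type}:({';'.join(parts)})"


print(semantic_clause("skill", "java", minMonths=12, monthsAgo=0))
# skill:(java;minMonths=12;monthsAgo=0)
print(semantic_clause("employer", "Smith; Jones (UK)"))
# employer:("Smith; Jones (UK)")
```

Routing every user-supplied value through one function keeps the escaping in a single place.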

The following semantic clauses are supported:

Custom Value

An alphanumeric token in one of the Id values injected into Resumes or Jobs. These are often used for filtering or partitioning the data within an index by statuses or other custom fields. Syntax: id:(term)

Skill

A skill term. Syntax: skill:(term;parameters)

Optional Parameters

Parameter  Explanation
minMonths  Minimum number of months of experience with this skill
maxMonths  Maximum number of months of experience with this skill
monthsAgo  Limit results to skills held within this number of months before the RevisionDate

Examples

Expression                                          Explanation
skill:(java)                                        Filter the skill java
skill:(java;minMonths=12)                           Filter the skill java with at least 12 months of experience
skill:(java;maxMonths=12)                           Filter the skill java with no more than 12 months of experience
skill:(java;monthsAgo=0)                            Filter documents that have the skill java currently
skill:(java;minMonths=12;maxMonths=24;monthsAgo=0)  Filter documents that have the skill java currently and 1-2 years experience

Certification/License

A certification or license term. Syntax: certification:(term) or license:(term)

Job Title

A position title in the candidate’s employment history or describing a job. Syntax: title:(term;parameters)

Optional Parameters

Parameter          Explanation
monthsAgo          Limit results to job titles held within this number of months before the RevisionDate
includeVariations  Determines whether or not to include variations of the original job title (for example, Developer instead of Web Developer). Defaults to true.

Examples

Expression                                     Explanation
title:(Web Developer)                          Filter the job title Web Developer
title:(Web Developer;monthsAgo=0)              Filter documents that have the job title Web Developer currently
title:(Web Developer;includeVariations=false)  Filter documents that have the exact job title Web Developer

Employer

An employer/organization name in the candidate's employment history or describing a job. Syntax: employer:(term;parameters)

Optional Parameters

Parameter  Explanation
monthsAgo  Limit results to employers held within this number of months before the RevisionDate

Examples

Expression                     Explanation
employer:(Google)              Filter the employer Google
employer:(Google;monthsAgo=0)  Filter for current employer Google

Executive Type

The type of executive experience the candidate must have. Term must be one of the following supported executive types: NONE, EXECUTIVE, ADMIN, ACCOUNTING, OPERATIONS, FINANCIAL, MARKETING, BUSINESS_DEV, IT, GENERAL, LEARNING. Syntax: executiveType:(term)

Current Management Level

The management level that a job requires or that a candidate has in the most recent position. Term must be one of the following supported management levels: None, Low, Mid, High. Syntax: currentManagementLevel:(term)

Author

When true, resume must have at least one publication. When false, resume must not have any publications. This clause is only valid when resumes were parsed with PublicationHistory coverage enabled. See the ParserSettings.Coverage.PublicationHistory setting in the parser configuration string. Syntax: isAuthor:(term)

Public Speaker

When true, resume must have at least one speaking event. When false, resume must not have any speaking events. This clause is only valid when resumes were parsed with SpeakingEventsHistory coverage enabled. See the ParserSettings.Coverage.SpeakingEventsHistory or the SkipSpeakingEventsParsing setting in the parser configuration string. Syntax: isPublicSpeaker:(term)

Has Military History

When true, resume must have some military history. When false, resume must not have any military history. This clause is only valid when resumes were parsed with MilitaryHistory coverage enabled. See the ParserSettings.Coverage.MilitaryHistory or the SkipMilitaryHistoryParsing setting in the parser configuration string. Syntax: isMilitary:(term)

Has Been Self Employed

When true, resume must have some self-employment history. When false, resume must not have any self-employment history. Syntax: hasBeenSelfEmployed:(term)

Has Patents

When true, resume must have at least one patent. When false, resume must not have any patents. This clause is only valid when resumes were parsed with PatentHistory coverage enabled. See the ParserSettings.Coverage.PatentHistory or the SkipPatentsParsing setting in the parser configuration string. Syntax: hasPatents:(term)

Has Security Credentials

When true, resume must have at least one security credential. When false, resume must not have any security credentials. This clause is only valid when resumes were parsed with SecurityCredentials coverage enabled. See the ParserSettings.Coverage.SecurityCredentials or the SkipSecurityCredentialsParsing setting in the parser configuration string. Syntax: hasSecurityCredentials:(term)

Security Credential

A security credential name. This clause is only valid when resumes were parsed with SecurityCredentials coverage enabled. See the ParserSettings.Coverage.SecurityCredentials or the SkipSecurityCredentialsParsing setting in the parser configuration string. Syntax: securityCredential:(term)

Education

A grouping of related education values. Syntax: education:(parameters)

At least one of the following parameters must be specified:

Parameter           Explanation
schoolName          Either an exact match or a normalized version of a school name.
degreeName          An exact match of a degree name.
degreeType          A specific type of degree that a resume has or that a job requires.
minimumDegreeLevel  A specific type of degree that is the minimum that a resume must have or that a job requires.
minimumGPA          A normalized GPA value from 0.0 to 1.0, with 1.0 being the top mark. For example, 3.5 on a scale of 4.0 would have a value of 0.875.
degreeMajor         An exact match of a degree major.
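The minimumGPA normalization is a simple division, sketched below together with the clause it would feed into. The function name is hypothetical, and the sketch assumes a grading scale where the largest number is the top mark; scales where a lower number is better would need a different conversion.

```python
def normalize_gpa(gpa, scale):
    """Convert a GPA on a given scale to the 0.0-1.0 range.

    Assumption: the highest value on the scale is the top mark.
    """
    if scale <= 0 or not 0 <= gpa <= scale:
        raise ValueError(f"GPA {gpa} is not valid on a scale of {scale}")
    return gpa / scale


normalized = normalize_gpa(3.5, 4.0)
clause = f"education:(minimumGPA={normalized})"
print(clause)  # education:(minimumGPA=0.875)
```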

Sovren Match/Search UI

The Sovren Matching UI provides the ability for clients to leverage a pre-built user interface for matching and searching. The interface includes forms for users to build custom match/search queries as well as a powerful query results screen. A client application can incorporate this generated UI into an iframe within an existing page, or as a separate page in the application. The generated web page contains all the necessary functionality to perform searching/matching; additionally, you can customize how the UI is generated to some extent to better suit your application.

Authentication

Similar to the normal match/search SaaS requests, the Matching UI is authenticated by including your service key in the request headers. Once a specific session is generated, that url is valid to authenticate once. A client application should redirect a user to the session immediately after generating the session for that user. The Matching UI handles authentication using bearer tokens which are stored in the client's browser. By generating the UI for a specific user and redirecting the user to the session, the UI knows which user to generate an auth token for. The url is only secure until it is visible to a user, so you should ensure your application automatically makes the request on the user's behalf (i.e. don't make users copy/paste the link). You should not worry about users sharing links, however, as subsequent requests to the url will require unauthenticated users to log in.

Be sure to never reveal your service key to your end users in javascript or otherwise. It is perfectly fine for a user to know their auth token, however, as it is encrypted and is analogous to a temporary password for that user in the Sovren Matching UI. Auth tokens should not be shared between users. Ideally, your users would never see the auth token, as it is only stored in the user's browser. Add/remove/modify users in the Sovren Customer Portal.

UI Customization

The Matching UI can be customized by adding your company logo to the top banner. To manage your logo/theme options, visit the Sovren Customer Portal. You can also customize which features are included in the generated UI. See the UIOptions request parameter here for more info.

External Sourcing

The Matching UI can show match/search results that fit your criteria from external sources, such as job boards and our custom web searching algorithms. The Matching UI uses your credentials, but aggregates all the results in one simple location so you can compare scores across a variety of different sources. If you see a result you are interested in, you can add it directly to your ATS and index the full resume. See the Hooks section below for more details.

In order to see job board results in the Matching UI, you must first add credentials for each job board using the User Management page in the Sovren Customer Portal.

Hooks

In order to make the Matching UI more useful in your client application, we have added the ability for users to take actions on documents that are found in search/match results. There are two different types of Hooks - user action hooks, and sourcing hooks. User action hooks make a GET request to a URL of your specification which can include the id of the document and the index in which the document lives. Sourcing hooks make a POST request to a URL of your specification, sending the source, document, and user so it can be added to your ATS.

User Action Hooks

As an example of a user action hook, you may want to have a 'Hire' action in your ATS. You could set up an endpoint in your ATS such as https://my-great-ats.com/hire/{resumeId}. You would then add a Hook called 'Hire' with the URL https://my-great-ats.com/hire/{id} in the Sovren Customer Portal.

The 'Hire' action will then be possible for any result document shown in the Matching UI.

When a user clicks that action, a GET request is executed to the defined URL, in a new tab, replacing {id} with the document id and {index} with the index id.
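The placeholder substitution can be sketched as follows. The function name and the second URL are hypothetical, and percent-encoding the ids is an assumption of this sketch; whether the Matching UI encodes values before substituting them is not stated here.

```python
from urllib.parse import quote


def fill_hook_url(template, document_id, index_id):
    """Replace {id} and {index} in a hook URL template.

    Assumption: values are percent-encoded so that they are safe to place
    in a URL path segment.
    """
    return (template
            .replace("{id}", quote(str(document_id), safe=""))
            .replace("{index}", quote(str(index_id), safe="")))


print(fill_hook_url("https://my-great-ats.com/hire/{id}", "resume 42", "resumes"))
# https://my-great-ats.com/hire/resume%2042

# Hypothetical template that uses both placeholders.
print(fill_hook_url("https://my-great-ats.com/view/{index}/{id}", "42", "resumes"))
# https://my-great-ats.com/view/resumes/42
```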

Sourcing Hooks

A common use for sourcing hooks is to add documents from external sources, such as job boards, to your ATS. You can add the URL for the sourcing hook in the Sovren Customer Portal.

The Matching UI sends a POST request to your endpoint with the name of the source (such as the job board or custom web sourcing), the id of the document from that source, the full parsed document, the source document text, and the email address of the user making the request.

{  
    "DocumentSource": "",
    "DocumentSourceId": "",
    "ParsedDocument": "",
    "SourceDocument": "",
    "UserEmail": ""
}

The Matching UI expects a response containing the name of the desired index to add the document to and a unique id to assign to that document.

{  
    "IndexName": "",
    "DocumentId": ""
}
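On the receiving side, a sourcing hook endpoint needs to read the posted fields and answer with an index name and a document id. The sketch below shows only that request-to-response step as a plain function, independent of any web framework. The function name, the default index name "resumes", and the use of a random UUID as the document id are all assumptions; a real ATS would store the document and return its own identifiers.

```python
import json
import uuid

REQUIRED_FIELDS = ("DocumentSource", "DocumentSourceId", "ParsedDocument",
                   "SourceDocument", "UserEmail")


def handle_sourcing_hook(body, index_name="resumes"):
    """Turn a sourcing hook request body into the expected response body.

    Assumptions: the target index is chosen by the caller and the new
    document id is a random, purely alphanumeric UUID hex string.
    """
    payload = json.loads(body)
    missing = [field for field in REQUIRED_FIELDS if field not in payload]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")

    # A real ATS would save payload["ParsedDocument"] and
    # payload["SourceDocument"] here before responding.
    document_id = uuid.uuid4().hex
    return json.dumps({"IndexName": index_name, "DocumentId": document_id})


# Hypothetical request body with placeholder values.
request_body = json.dumps({
    "DocumentSource": "ExampleJobBoard",
    "DocumentSourceId": "abc123",
    "ParsedDocument": "{}",
    "SourceDocument": "resume text",
    "UserEmail": "recruiter@example.com",
})
print(handle_sourcing_hook(request_body))
```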