
Our software consists of two parts - the engine and the API. Release notes for new features, improvements, and bug fixes will show under the Engine Release Notes. Release notes for updates to the API input and output structure will show under the API Release Notes.

API Release Notes

Version 10.0.0

December 15, 2020

Initial Release

Introducing Version 10 of the Sovren API. Both V9 and V10 use the same parsing and matching engines under-the-hood, but V10 is more streamlined and has a vastly simpler output. Please visit this link for an in-depth comparison.

Engine Release Notes

Version 9.6.1

March 25, 2021

Improvements

Resume & Job Order Parser

Much better parsing for two-column resumes, including resumes that are only partially two-column.

Improved detection/parsing of date ranges that are split into a vertical multi-line section within an otherwise horizontal text section.

Better contact info parsing. We now report the best-formatted phone number when multiple versions of the same number are present in the resume. We no longer report fax numbers because it is 2021, not 1991. We no longer report work phone numbers when there are several other non-work phone numbers.

Improved recognition of stacked person names (names broken into words on separate lines).

We no longer report 'Politics' (which was only active for Chinese resumes) because we feel it is immoral.

Better education parsing. This is the most significant advance in education parsing in several years. Fewer false cognate school entries, better degree names, and much better major names. Also all certificates (except in certain locales such as Australia) have been moved to the Certifications section since they are really not the same as degreed education.

Far better recognition/handling of job nesting (or not) by company name and dates.

Improved accuracy on all aspects of employment parsing: company name, job title, location, and dates.

Improved parsing for LinkedIn resumes, especially for some non-English languages.

Improved parsing for Indeed's new resume format, which includes an Assessments section.

Skills are now generally output as Proper-case rather than UPPERCASE.

Improved accuracy across all data types in all languages, but especially so in English, German, Italian, Dutch, French, Russian, Hungarian, Czech, and Swedish.

Document Converter

Improved PDF Conversion.

Upgraded 3rd party software.

Bug Fixes

Resume & Job Order Parser

Fixed various reported bugs.

Document Converter

Fixed a bug that was causing some blank HTML to be output on some corrupted/no-text documents.

Version 9.5.2

December 15, 2020

New Features

AI Matching

Added filters for Average Months Per Employer and Job Predictive Index. These filters are only available on documents indexed after this release date. Please reach out to support@sovren.com for more information.

Improvements

Document Converter

Improved HTML Conversion.

Bug Fixes

AI Matching

Minor runtime bug fixes.

Version 9.5.1

October 9, 2020

New Features

Resume & Job Order Parser

Added initial support for Indonesian culture.

Matching UI

More customizable User Action Hooks for Sourcing. For more info, see here.

Show results from Job Boards by default. For more info, see here.

Improvements

Resume Parser

More accurate Contact Info, Education and Employment parsing.

Speed improvements.

Minor bug fixes.

Document Converter

Improvements to selection of correct conversion strategy.

Version 9.5

September 3, 2020

New Features

AI Matching

Scoring and Default Sorting

We have developed a new score (SovScore) that is derived from the combination of the WeightedScore and the Reverse Compatibility Score (RCS). Blending these two scores together using a proprietary algorithm allows us to provide a single score that blends results from both directional perspectives of the match. This new SovScore results in better matches rising to the top of the result set, while matches that score highly only from a single directional perspective will be scored lower.

All match results returned by the API, and in Sovren’s own AI Matching UI, are sorted in descending order by the SovScore property. Please note that you are always free to use your own custom sorting or postprocessing of the returned results before displaying them in your own UI.
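As a rough illustration of custom sorting, the sketch below re-sorts a list of match results client-side. The shape of `matches` (a list of dicts with `Id`, `SovScore`, and `WeightedScore` keys) is an assumption made for this example; only the SovScore and WeightedScore names come from these notes, and the real response may be structured differently.

```python
# Hypothetical, simplified match results; the real API response may differ.
matches = [
    {"Id": "doc-1", "SovScore": 71, "WeightedScore": 80},
    {"Id": "doc-2", "SovScore": 88, "WeightedScore": 75},
    {"Id": "doc-3", "SovScore": 71, "WeightedScore": 90},
]


def sort_matches(results, key="SovScore", tiebreak="WeightedScore"):
    """Return results sorted descending by `key`, then by `tiebreak`."""
    return sorted(results, key=lambda m: (m[key], m[tiebreak]), reverse=True)


ordered = [m["Id"] for m in sort_matches(matches)]
print(ordered)  # ['doc-2', 'doc-3', 'doc-1']
```

The tiebreak on WeightedScore is only one possible choice; any other property could be used when two results share the same SovScore.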

Transparency

In an effort to continually increase the industry-leading transparency of our AI Matching engine, we have added new metadata to the match response. We added an EnrichedData property for each direction of the match. This object contains (a) the terms that were found, (b) the terms that were not found, and (c) explanatory messages (in English) about each data point. For example, in the skills object you might see that the skill 'java' was found, and in the corresponding explanation it might state that while 'java' was found in the document as a skill, it wasn't used by the candidate recently.

Improvements

Resume & Job Order Parser

Improved coverage for certain CAD/CAM skills.

Version 9.4.10

August 26, 2020

Improvements

Resume & Job Order Parser

Better nesting of jobs based on employer names.

Better phone number parsing (eliminated some false cognates).

Better work history parsing in Dutch, Russian, Italian, French, and Greek.

Improved coverage of the cloud computing sub taxonomy.

Document Converter

Can convert more documents to HTML, whereas some conversions previously failed.

Bug Fixes

Resume Parser

Minor runtime bug fixes.

Version 9.4.9

August 6, 2020

Improvements

Resume & Job Order Parser

Improved rejection of false positive addresses and phone numbers.

Substantial improvements to skills taxonomy weights (expressed in output as percentages).

Better parsing for Dutch, Hungarian, and Croatian languages.

Better accuracy in all sections.

9% faster.

Document Converter

Better conversion to HTML and PDF.

Some files that could not be converted to HTML or PDF because of internal errors in the document can now be converted to HTML or PDF.

Version 9.4.8

June 23, 2020

Improvements

Resume & Job Order Parser

Remove some spurious HTML tags in plain text before parsing.

Better certifications parsing.

Better location parsing in all languages.

More accurate education parsing in all languages.

Bug Fixes

Resume Parser

Minor runtime bug fixes.

Version 9.4.7

May 29, 2020

Bug Fixes

Job Order Parser

Fixed a language identification bug when parsing non-English jobs.

Version 9.4.6

May 22, 2020

Bug Fixes

Document Converter

Fixed a rare HTML conversion issue.

AI Matching

Fixed various request validation bugs for malformed requests.

Version 9.4.5

May 13, 2020

Improvements

Resume Parser

Improved LinkedIn profile parsing.

Sourcing

Added Monster as a Sourcing option. Contact sales@sovren.com for more information.

Version 9.4.3

May 1, 2020

Improvements

Resume Parser

Better Hungarian parsing.

Better parsing of certifications.

Better parsing of educational majors.

Fewer false positives on educational GPAs.

Generally more accurate.

Fixed some minor runtime bugs.

AI Matching

Improved the handling of punctuation in full-text searching.

Bug Fixes

AI Matching

Fixed edge case for handling bad input to CategoryWeights.

Version 9.4.2

March 17, 2020

Improvements

Resume Parser

Data improvements.

Document Converter

Improved conversion of PDF to HTML.

More accurate detection of LinkedIn PDFs.

AI Matching

We deprecated FilterCriteria.Skills[i].MonthsOfExperienceRange in favor of a simpler approach with FilterCriteria.Skills[i].ExperienceLevel. This change is backward compatible and all MonthsOfExperienceRange input will be mapped to the respective ExperienceLevel. Details on the new property can be found in the Request Body documentation for the following endpoints:

Bug Fixes

Resume Parser

Fixed two runtime bugs.

Version 9.4.1

March 11, 2020

Bug Fixes

Resume Parser

Fixed several runtime bugs.

Job Order Parser

Additional runtime bug fixes.

Version 9.4.0

March 4, 2020

Improvements

Resume Parser

Better LinkedIn parsing.

World's fastest resume parsing is now substantially faster.

Bug Fixes

Resume Parser

Fixed several runtime bugs.

Version 9.3.10

February 14, 2020

Improvements

Resume Parser

Better LinkedIn parsing.

Better email parsing.

Better Chinese parsing.

Improved job title parsing for Bulgarian, Estonian, Finnish, Croatian, Hungarian, Lithuanian, Latvian, Polish, Romanian, Slovak, and Slovenian.

Improved education parsing for Indonesian school names.

Improved handling of company names which are actually URLs.

Improved handling of invalid dates such as February 29, 2007 (2007 was not a leap year).

Added a new configuration option to output all Metadata in English regardless of the resume language. Set "OutputFormat.AllSummariesInEnglish = true;" in the configuration string.
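The invalid-date case listed above (February 29, 2007) can also be caught in downstream code. The sketch below is not the parser's own logic; it only shows one way, using Python's standard library, to check whether a calendar date exists. The function name `is_valid_date` is made up for this example.

```python
from datetime import date


def is_valid_date(year, month, day):
    """Return True if the calendar date exists, False otherwise."""
    try:
        date(year, month, day)
        return True
    except ValueError:
        return False


print(is_valid_date(2007, 2, 29))  # False: 2007 was not a leap year
print(is_valid_date(2008, 2, 29))  # True: 2008 was a leap year
```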

Job Parser

Improved parsing for years of experience requirements.

Document Converter

Better PDF Conversions.

Bug Fixes

Resume Parser

Fixed several runtime bugs.

Fixed bug that caused some jobs to be duplicated when reported on some German CVs.

Version 9.3.9

November 21, 2019

Improvements

Resume Parser

Better Work History parsing.

Improved LinkedIn profile parsing.

Document Converter

Performance optimizations.

Version 9.3.8

November 19, 2019

Improvements

SaaS Services

System performance optimization.

Version 9.3.7

November 8, 2019

Improvements

Resume Parser

Better Swedish work history parsing.

Better parsing for Educational History in all languages.

Improved skills taxonomy.

Improved LinkedIn profile parsing for location and street level address information.

Improved LinkedIn profile URL parsing.

Improved Candidate Name suffix parsing.

Improved recognition of some obscure date formats, primarily for European resumes in a language other than English.

Improved Chinese language parsing accuracy.

Improved accuracy of school names and degrees.

Document Converter

Improved performance for converting documents to HTML.

Reduced timeouts on Excel documents.

Improved LinkedIn profile text conversion for the latest version of LinkedIn profiles.

SaaS Services

Improved performance for all SaaS Service endpoints.

Bug Fixes

Resume Parser

Fixed an obscure JSON output inconsistency. The following fields would sometimes output as a string, instead of an object:

  • DOB
  • Gender
  • Nationality
  • Marital Status
  • Required Salary
  • Current Salary
This issue is now corrected so that these fields always output as a JSON object, not a string. If you use these fields, check that your code expects an object, not a string.
For example, the following output could previously occur if the personal information was not inferred, or if no explicit currency was found:
"UserArea": {
  "sov:ResumeUserArea": {
    "sov:PersonalInformation": {
      "sov:Nationality": "USA",
      "sov:Gender": "Female",
      "sov:MaritalStatus": "Married",
      "sov:CurrentSalary": "250000.00",
      "sov:RequiredSalary": "290000.00"
    }
  }
}
These fields now always output as an object, such as:
"UserArea": {
  "sov:ResumeUserArea": {
    "sov:PersonalInformation": {
      "sov:Nationality": {
        "@inferred": "false",
        "#text": "USA"
      },
      "sov:Gender": {
        "@inferred": "false",
        "#text": "Female"
      },
      "sov:MaritalStatus": {
        "@inferred": "false",
        "#text": "Married"
      },
      "sov:CurrentSalary": {
        "@currency": "",
        "#text": "250000.00"
      },
      "sov:RequiredSalary": {
        "@currency": "",
        "#text": "290000.00"
      }
    }
  }
}
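If your integration has to read both the older string form and the object form shown above (for example, when reprocessing stored output), a small defensive accessor may help. This is a minimal sketch; the helper name `field_text` is made up for this example, and it assumes the value is held under the "#text" key as in the sample.

```python
def field_text(value):
    """Return the text of a field that may be a plain string or an object.

    Older output could emit "USA"; the object form emits
    {"@inferred": "false", "#text": "USA"}.
    """
    if isinstance(value, dict):
        return value.get("#text")
    return value


old_style = {"sov:Nationality": "USA"}
new_style = {"sov:Nationality": {"@inferred": "false", "#text": "USA"}}

print(field_text(old_style["sov:Nationality"]))  # USA
print(field_text(new_style["sov:Nationality"]))  # USA
```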

Version 9.3.6

September 30, 2019

Improvements

Resume Parser

Improved LinkedIn profile parsing.

Added new skills.

Improved Employment parsing for Swedish resumes.

Bug Fixes

Resume Parser

Fixed a runtime bug.

Version 9.3.5

September 26, 2019

Improvements

Resume Parser

Added parsing support for UK academic qualifications such as GCSE, BTEC, NVQ, and DipCG.

AI Matching

Enhanced the robustness of the matching engine.

Added support for non-integer taxonomy ids for Match by Criteria requests.

Version 9.3.4

September 12, 2019

Improvements

Resume Parser

Better logic for two-character skills to prevent reporting erroneous data as a skill.

More accurate Company Name parsing.

Better Italian Employment History parsing.

More accurate education parsing.

Up to 20x faster on some resumes, and about 9% faster overall.

Bug Fixes

Resume Parser

Fixed several runtime bugs that were tested to occur about once every million documents.

New Features

Matching UI

New skills autocomplete functionality to vastly improve the accuracy and efficiency of users searching for specific skills.

Version 9.3.3

August 28, 2019

Improvements

Resume Parser

Improved parsing of LinkedIn profiles in Spanish.

Improved the Experience Summary output when no dominant taxonomy was found.

Improved phone number parsing.

Document Converter

Improved detection and conversion of reversed text in poorly constructed PDF documents.

AI Matching

Diacritics (e.g. é) can now be used in index names and document IDs.

Version 9.3.2

August 14, 2019

Improvements

Resume Parser

Major improvements to Employment parsing by changing the logic used for nesting of jobs by date ranges. The logic was improved all around, meaning that more jobs that should be nested do get nested, and more jobs that should not be nested don't get nested.

More accurate LinkedIn parsing.

Bug Fixes

Resume Parser

Fixed several runtime bugs that were tested to occur about once every million documents.

Document Converter

Fixed a condition that would report a file as corrupt rather than having too-short-to-be-believed lines.

SaaS Services

Fixed a bug where having whitespace (such as a tab character) in the AccountId header could cause an error.

Version 9.3

July 29, 2019

New Features

Resume Parser

Added full support for the following languages/locales:

  • Bulgarian
  • Croatian
  • Estonian
  • Finnish
  • Hungarian (was previously partially supported)
  • Latvian
  • Lithuanian
  • Polish
  • Romanian
  • Slovak
  • Slovenian

Job Parser

Added full support for the following languages/locales:

  • Bulgarian
  • Croatian
  • Estonian
  • Finnish
  • Hungarian (was previously partially supported)
  • Latvian
  • Lithuanian
  • Polish
  • Romanian
  • Slovak
  • Slovenian

AI Matching

Added full support for the following languages/locales:

  • Bulgarian
  • Croatian
  • Estonian
  • Finnish
  • Hungarian (was previously partially supported)
  • Latvian
  • Lithuanian
  • Polish
  • Romanian
  • Slovak
  • Slovenian

Bug Fixes

AI Matching

Fixed a bug where setting MatchCriteria.MonthsManagementExperience = null would cause an error.

Version 9.2

July 18, 2019

New Features

Added a new endpoint to get the number of documents in an index, more information here.

Added a new endpoint to Geocode and Index a document, more information here.

Improvements

Resume Parser

Accuracy improvements in all languages.

More accurate contact info parsing. Better phone number type classification.

More accurate LinkedIn parsing.

More accurate Employment parsing.

More accurate Skills parsing with new terms.

More accurate Education parsing. Far more accurate degree types in every language.

Added support for Resume Quality codes 121-124 for resumes from UK and NZ.

Better parsing of Colombian national identity numbers.

Approximately 3%-14% faster than previous release.

Job Parser

Parsing throughput has increased over 50%.

Document Converter

Better detection and correction of reversed text.

More useful conversion of LinkedIn pdf profiles in the varied two-column formats.

Faster conversion of PDF to HTML.

Better trimming of leading whitespace preceding first non-whitespace character in converted text.

When documents take too long to convert, we now return converted text in many more instances, whereas before we returned no text.

Improved messaging for possible conversion errors.

New Output Validity: ovLinesSeemTooShort.

Added messages for output validity warnings and errors.

Upgraded third party DLL versions:

  • Aspose Words 19.4
  • Aspose PDF 19.4
  • dtSearch 7.93.8596.18093

AI Matching

Added a restriction for Match by Criteria. To enable Match by Criteria on your account, please contact sales@sovren.com.

We made the boolean connectors (AND, OR, and NOT) in the FilterExpression.SearchExpression case insensitive. For example, you can now search for "java and c#" as opposed to having to specify "java AND c#".

We significantly improved normalization of school names behind the scenes, so that searches on school names will return more comprehensive results. We also stopped indexing high school names to reduce false positives.

Updated the logic for boolean search fields (IsAuthor, HasPatents, IsMilitary, IsPublicSpeaker, HasSecurityCredentials) to only support filtering when passing true for those fields. Previously, passing a value of false for any of these fields would filter out results that had the specified criteria. Now, passing false or null for any of these fields will not filter the results any further.
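To make the new behavior concrete, the sketch below computes which boolean fields in a request would still narrow the results. It assumes, purely for illustration, that the filter criteria are available as a flat Python dict keyed by the field names above; the helper name is made up and this is not the engine's implementation.

```python
# Boolean search fields named in this release note.
BOOLEAN_FILTERS = (
    "IsAuthor",
    "HasPatents",
    "IsMilitary",
    "IsPublicSpeaker",
    "HasSecurityCredentials",
)


def effective_boolean_filters(filter_criteria):
    """Return only the boolean fields that still filter results (set to True).

    Fields set to False or None no longer narrow the result set.
    """
    return {
        name: True
        for name in BOOLEAN_FILTERS
        if filter_criteria.get(name) is True
    }


criteria = {"IsAuthor": True, "HasPatents": False, "IsMilitary": None}
print(effective_boolean_filters(criteria))  # {'IsAuthor': True}
```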

We added wildcard support to the following fields on the FilterCriteria object in Matching and Searching (for more information refer to the documentation):

  • Skills
  • JobTitles
  • Certifications
  • SchoolNames
  • DegreeNames
  • Employers
  • SecurityCredentials

We added the following properties to the FilterCriteria object in Matching and Searching (for more information refer to the documentation):

  • CustomValueIdsMustAllExist
  • SchoolNames
  • DegreeNames
  • DegreeTypes
  • EmployersMustAllBeCurrentEmployer
  • SkillsMustAllExist
  • IsTopStudent
  • IsCurrentStudent
  • IsRecentGraduate
  • JobTitles
  • ExecutiveType
  • Certifications
  • MonthsManagementExperience
  • CurrentManagementLevel
  • LanguagesKnown
  • LanguagesKnownMustAllExist
  • Taxonomies

We added support for the following semantic clauses to be used in the FilterCriteria.SearchExpression (for more information refer to the documentation):

  • docid:()
  • taxonomy:()
  • schoolname:()
  • degreename:()
  • degreetype:()
  • minimumgpa:()
  • minimumdegreelevel:()
  • postalcode:()
  • municipality:()
  • region:()
  • country:()
  • location:()
  • language:()
  • doclanguage:()

All SaaS Services

Better error messages for invalid requests.

Matching UI

Updated the Filter and Match Criteria page layout for easier usability.

Added support for all fields in the Search API, including newly added fields, document metadata, and other uncommon filters.

Added support for custom value picklists.

Improved validation messages.

Bug Fixes

Resume Parser

Improved non-English Sovren-generated candidate summaries.

Fixed bug that caused some jobs to be duplicated when reported on some German CVs.

Fixed an issue where skills and normalization data could be cached for longer than 24 hours.

Document Converter

Corrected situation where some non-binary data was being detected as binary data and being reported as ovProbableGarbageInText.

AI Matching

Fixed an issue applying an upper bound of years experience for a skill.

Fixed an issue filtering multiple document languages in the same transaction.

Fixed an issue where requests would fail when searching or matching across multiple indexes that have documents with the same document id in the results.

Fixed an issue where match results would sometimes be out of order when sorted by the score.

Version 9.1.2

November 13, 2018

Improvements

Resume Parser

7% faster.

Parses all LinkedIn past and present versions extremely accurately.

Better Swedish date parsing.

More accurate employment parsing.

More accurate education parsing.

Improved resume sectioning.

Document Converter

Better PDF conversions.

Version 9.1.1

November 8, 2018

Improvements

All SaaS Services

Better error messages for invalid requests.

Resume Parser

Fixed management level output for resumes with no current employment.

AI Matching

Improved Bimetric Scoring in cases where no second-best taxonomy is found.

Better comparison algorithm for job titles that contain prepositions.

Improved languages matching algorithm.

Version 9.1

October 20, 2018

Improvements

Resume Parser

Greatly improved parsing of gradepoint averages in Education.

Greatly reduced the number of spurious trailing work history jobs or educational schools.

Thousands of improvements to internal data lists.

Vastly improved LinkedIn parsing. We are now able to capture the hidden LinkedIn urls, and ignore the broken partial LinkedIn urls.

Degrees which are just certifications and not intended to be high school-or-higher degrees are now not output in Education, but rather, are output in Certifications.

Better parsing of school names. Fewer school names with City names hanging on the end (sometimes they need to be left that way; other times they need to be stripped – we do both better now).

Better parsing of Russian, Italian, and Norwegian schools and degrees.

Far more accurate nesting of PositionHistory nodes within EmployerOrg nodes: specifically, far fewer wrongful nesting events, and a few more correct nesting events.

We restored and improved the parsing accuracy for BOTH past and present LinkedIn resumes in all known formats.

Improved Company Name and Position Title accuracy by several percentage points. Improved the ability to distinguish between ambiguous elements.

Document Converter

Better removal of page numbers.

Vastly improved LinkedIn conversions. Conversion to single column format now happens in correct order. Page markers are properly removed. Broken lines are re-connected.

Real formatted HTML output is now available from PDFs.

Improved HTML-to-text conversions. HTML should not contain tabs except within <pre> tags, but some HTML wrongfully does. In the past, these tabs were converted to a single space; now, we convert them to multiple spaces. This ends up allowing the Parser to “see” many more section headers that in the past were invisible because they collided with nearby words.

LinkedIn URLs

We now report LinkedIn urls with the Use field as linkedIn. This provides the ability for a more programmatic way to extract this url.

<ContactMethod>
    <Use>linkedIn</Use>
    <Location>onPerson</Location>
    <WhenAvailable>anytime</WhenAvailable>
    <InternetWebAddress>https://www.linkedin.com/in/demo</InternetWebAddress>
</ContactMethod>
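As a sketch of the programmatic extraction this enables, the example below pulls LinkedIn URLs out of ContactMethod elements using Python's standard library. The enclosing `<ContactInfo>` root and the second, non-LinkedIn contact method are invented for the example, and the fragment is shown without namespaces; real parser output may be namespaced, in which case the tag names in the lookups would need adjusting.

```python
import xml.etree.ElementTree as ET

# The linkedIn ContactMethod is copied from the sample above; the
# <ContactInfo> wrapper and the e-mail entry are hypothetical.
xml_text = """
<ContactInfo>
  <ContactMethod>
    <Use>personal</Use>
    <InternetEmailAddress>demo@example.com</InternetEmailAddress>
  </ContactMethod>
  <ContactMethod>
    <Use>linkedIn</Use>
    <Location>onPerson</Location>
    <WhenAvailable>anytime</WhenAvailable>
    <InternetWebAddress>https://www.linkedin.com/in/demo</InternetWebAddress>
  </ContactMethod>
</ContactInfo>
"""


def linkedin_urls(root):
    """Collect InternetWebAddress values whose Use field is linkedIn."""
    urls = []
    for method in root.iter("ContactMethod"):
        if method.findtext("Use") == "linkedIn":
            url = method.findtext("InternetWebAddress")
            if url:
                urls.append(url)
    return urls


print(linkedin_urls(ET.fromstring(xml_text)))
# ['https://www.linkedin.com/in/demo']
```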

Bug Fixes

Fixed a bug in the ReservedData section output that would cause an error in scrubbing PII.

We were eliminating some valid URLs. We fixed that so that we now report more URLs.

Version 9.0.2

September 7, 2018

New Products

Sovren Apply

Provides a candidate portal for ingesting candidate resumes without the need to reinvent the wheel. Uses the latest Sovren Resume Parser and can integrate directly with the AI Matching Engine. For more information visit https://sovren.com/products/apply.

Sovren Sourcing

Quickly source candidates from 3rd party job boards and the web. The results leverage Sovren's Parsing and Scoring to add intelligence on top of the searching provided by those 3rd parties. This provides a quick way to evaluate candidates from multiple sources including your existing candidates. For more information visit https://sovren.com/products/sourcing.

New Features

AI Matching

Added an endpoint to check if a document exists in an index. This can be used when trying to determine if a document has already been indexed and is much lighter weight than retrieving the entire document. REST API documentation.

Improvements

Better PDF conversions to fix some things intentionally broken by LinkedIn.

Version 9

May 18, 2018

Upgrade Path

If you are upgrading from version 8.0 or later, switching to version 9 is as simple as changing the url of the service from v8 to v9. No other changes needed, typically.

If you are upgrading from version 7.5 or earlier, this version isn't compatible with version 7.5. To upgrade to version 9.0, we recommend the following approach:

  1. If you're using a parser configuration string, regenerate your string in the new human-readable Name=Value pair format. Details on this new configuration string, and a conversion tool are documented here.
  2. Parse the Sample.doc file (as well as some of your own documents) in the current version you use, and with 9.0 using our Demo Application and save those results to disk.
  3. Use a document comparison tool to evaluate the differences, specifically the new fields. There is a lot of new metadata provided that could be of high value to integrate in your application. These new fields are detailed below in the New Features section. For a document comparison tool, we really like Beyond Compare.
  4. Remap your API calls to the new 9.0 methods as described in the API Documentation, make the desired changes to your implementation to leverage the new metadata, change the URL to point to version 9.0, and enjoy.
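If a dedicated comparison tool is not at hand, steps 2 and 3 can be approximated with a short script. The sketch below uses Python's standard `difflib` to list lines that appear only in the newer output; the two sample strings and the `<NewField>` element are placeholders, not real parser output.

```python
import difflib


def diff_outputs(old_text, new_text):
    """Return unified-diff lines between two saved parser outputs."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="current_version",
            tofile="v9",
            lineterm="",
        )
    )


# Placeholder documents standing in for results saved to disk.
old = "<Resume>\n  <Name>Demo</Name>\n</Resume>"
new = "<Resume>\n  <Name>Demo</Name>\n  <NewField>value</NewField>\n</Resume>"

# Keep only added lines, skipping the '+++' file header.
added = [
    line[1:].strip()
    for line in diff_outputs(old, new)
    if line.startswith("+") and not line.startswith("+++")
]
print(added)  # ['<NewField>value</NewField>']
```

In practice you would read the two strings from the files saved in step 2 instead of defining them inline.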

New Features

Added an endpoint to scrub the Personally Identifiable Information from a Resume/CV. More information can be found in the REST API documentation.

Improvements

Resume Parser

Improved the skills taxonomies for all languages. We added a new taxonomy/Subtaxonomy for all languages: "No dominant taxonomy → Not enough data". When we cannot determine the taxonomy with confidence because so few (or no) skills were found, we output "No dominant taxonomy → Not enough data".

Improved accuracy on Work History and Education.

Improved sectioning of resumes.

Overall accuracy is up about 3 absolute percentage points, with 99% of the previous speed. Sovren parsing speed is typically at least 5x faster than our nearest competitor’s speed, and we produce about 1/10 to 1/3 as many mistakes as our nearest competitor.

AI Matching Engine

Improved the handling of management level queries in Matching when there was no management level data in the source document.

Breaking Changes

Skills

We deprecated the SkillsStyle property because we now have a single canonical way and place to output skills.

Skills are now output only in the resume's UserArea, or job's SkillsTaxonomyOutput. The output is extremely easy to read and understand from both a human and programmatic standpoint. The output taxonomies are sorted in descending order of importance, and skills are alphabetical within the subtaxonomies, and child skills are nested within parent skills.

Also, importantly, we now use the English skills list for non-English skills parsing in addition to the detected language's built-in skills list. This will generally result in more skills being found, with very few false cognates.

DO NOT use/rely on the skill Ids that are output. We reserve the right to modify skill names and to preserve the skill Id when we do so. In some cases, we append a language code to skill Ids so that we can output them alongside another translation of that skill with the same Id. If you are relying on skill Ids, stop!

NOTE FOR CUSTOM SKILLS LISTS: When developing your custom skills lists, you must avoid using ANY Sovren taxonomy or skill Ids. The only way to be certain of that is to prepend or append an alphabetical character to your Ids if they are only integers.
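One way to follow this advice is to prefix integer-only Ids before building a custom skills list. The sketch below is only an illustration; the function name and the default prefix "c" are arbitrary choices, not something Sovren prescribes.

```python
def safe_custom_id(raw_id, prefix="c"):
    """Prepend an alphabetical prefix to integer-only Ids.

    Ids that already contain non-digit characters are returned unchanged.
    """
    if not prefix.isalpha():
        raise ValueError("prefix must be alphabetical")
    text = str(raw_id)
    return prefix + text if text.isdigit() else text


print(safe_custom_id(1234))       # c1234
print(safe_custom_id("java-01"))  # java-01
```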

Other

We deprecated the ParserSettings.OutputFormat.ReportAllCompanyNamesAndPositionTitlesRegardless and ParserSettings.OutputFormat.ContactMethod properties.

We made these properties read-only:

  • ParserSettings.OutputFormat.XmlFormat
  • ParserSettings.OutputFormat.MinimumCompanyNameProbability
  • ParserSettings.OutputFormat.MinimumPositionTitleProbability

We moved the Bimetric Score endpoint from /bimetricanalyzer to /scorer/bimetric.
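If endpoint URLs are stored in configuration, the path change can be applied mechanically. The sketch below rewrites only the path suffix and leaves the rest of the URL alone; the host `api.example.com` and the `/v9` prefix are placeholders, so substitute whatever base URL your account actually uses.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_SUFFIX = "/bimetricanalyzer"
NEW_SUFFIX = "/scorer/bimetric"


def migrate_bimetric_url(url):
    """Rewrite a URL ending in /bimetricanalyzer to end in /scorer/bimetric."""
    parts = urlsplit(url)
    path = parts.path
    if path.endswith(OLD_SUFFIX):
        path = path[: -len(OLD_SUFFIX)] + NEW_SUFFIX
    return urlunsplit(parts._replace(path=path))


# Placeholder host and version prefix, for illustration only.
print(migrate_bimetric_url("https://api.example.com/v9/bimetricanalyzer"))
# https://api.example.com/v9/scorer/bimetric
```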

Bug Fixes

Fixed an uncommon issue in our JSON output where some arrays were output as objects when they only had a single item.
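Integrations that also need to read output produced before this fix can normalize such values defensively. This is a minimal sketch; the helper name `as_list` and the sample values are made up for illustration.

```python
def as_list(value):
    """Normalize a value that may be a list, a single item, or missing."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


single = {"name": "Python"}                          # old single-item form
several = [{"name": "Python"}, {"name": "SQL"}]      # normal array form

print(len(as_list(single)))   # 1
print(len(as_list(several)))  # 2
print(as_list(None))          # []
```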

Fixed an issue where the OutputFormat.NormalizeRegions parser setting was being ignored.

Version 8.3

February 19, 2018

Breaking Changes

AI Matching Engine

Typically, minor releases don't contain breaking changes, but this is a unique case: our AI Matching service hasn't yet gone into production for our clients, and we wanted to take the opportunity to add some important improvements. Moving forward, breaking changes will be reserved for major releases and will be deployed alongside the existing service at a new URL.

We made the following changes to our AI Matching Api:

  • We now strip all Personally identifiable information (PII) from the parsed document prior to storage in the index. We don't store this information anywhere in our platform, so the only way to identify a document is by the unique id that is specified to us at time of storage.

Resume Parser

Added ResumeQuality element to the Resume element’s UserArea. These changes are reflected in the SovrenResumeExtensions.xsd. If you use XML validation, this will cause your validation to fail unless you use the new XSD. NEVER EVER use XML schema validation in anything but a QA environment! Never use an intake process that will fail if new nodes appear in the output.

Restored SkillId to the skills output. We still heavily discourage integrators from depending on Sovren SkillId. We also eliminated some of the skills pruning implemented in earlier versions of 8.x, as these reductions to the output were reducing the effectiveness of searching and matching against the output.

New Features

AI Matching Engine

We added the following features:

  • Added support for dashes and underscores in index names. This is helpful when you need to delimit information in the index name.
  • Added support for underscores in document names.
  • Improved the Management Level query
  • Added configuration options to the POST /parser/resume endpoint to be able to Geocode from a specified address, or to insert specified latitude/longitude. These options were already a part of the POST /geocode endpoint, but can now all be done from a single api call.
  • Added configuration options to the /parser/resume endpoint to be able to specify custom ids for indexing. These options were already a part of the POST /index/{indexId}/documents/{documentId} endpoint, but can now all be done from a single api call.

Resume Parser

We now calculate and output a Resume Quality summary along with related information about known or suspected problems with the resume. This data is now output in the ResumeQuality element of the Resume element’s UserArea. This score can help flag those resumes that are low quality, so that you can potentially ask the candidate to fix the identified problems. NOTE: never fix the XML. ALWAYS fix the unparsed resume and then re-parse it.

The ResumeQuality outputs details about known and suspected problems with the resume, in decreasing order of severity/importance. If the resume has no detected issues, it will output as follows:

<sov:ResumeQuality>
    <sov:Assessments>
        <sov:Assessment>
            <sov:Level>No Issues Found</sov:Level>
        </sov:Assessment>
    </sov:Assessments>
</sov:ResumeQuality>
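A reader for this element can be written so that it tolerates unknown or newly added nodes, in line with the advice above not to fail when new nodes appear. The sketch below matches elements by local name and ignores everything else. The namespace URI bound to the `sov` prefix is a placeholder needed to make the fragment parse on its own, not necessarily the real Sovren namespace, and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI; the real one may differ.
xml_text = """
<sov:ResumeQuality xmlns:sov="urn:example:sov">
    <sov:Assessments>
        <sov:Assessment>
            <sov:Level>No Issues Found</sov:Level>
        </sov:Assessment>
    </sov:Assessments>
</sov:ResumeQuality>
"""


def local_name(tag):
    """Strip the '{namespace}' part ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def quality_levels(root):
    """Map each assessment Level to its list of Information messages."""
    levels = {}
    for element in root.iter():
        if local_name(element.tag) != "Assessment":
            continue
        level, findings = None, []
        for node in element.iter():
            name = local_name(node.tag)
            if name == "Level":
                level = (node.text or "").strip()
            elif name == "Information":
                findings.append((node.text or "").strip())
            # Any other node is ignored rather than treated as an error.
        if level:
            levels[level] = findings
    return levels


print(quality_levels(ET.fromstring(xml_text)))  # {'No Issues Found': []}
```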
Here is sample output that demonstrates the four problem levels:
<sov:ResumeQuality>
    <sov:Assessments>
        <sov:Assessment>
            <sov:Level>Fatal Problems Found</sov:Level>
            <sov:Findings>
                <sov:Information>We had to calculate where the work history section was. The section header was either missing, spanned multiple lines, or unknown. The work history section should have a header 'Work History' before listing the content.</sov:Information>
                <sov:Information>This resume is approximately 9 pages long, and appears to be a curriculum vitae. Such documents are prone to errors due to the use of nonstandard headers and the vast amount of data describing patents, speaking engagements, research, advisory roles, publications, etc. Accordingly, only the first WORK HISTORY section was parsed, as that usually results in far greater accuracy.</sov:Information>
                <sov:Information>Each of the following sections in the resume contain more content than the work history and education sections combined: 'Publikationen', 'Fremdsprache'. This usually indicates a major problem with section headers or formatting within the resume.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
        <sov:Assessment>
            <sov:Level>Major Issues Found</sov:Level>
            <sov:Findings>
                <sov:Information>The following section in the resume contains more content than recommended: 'Publikationen' This section should be less than 10 lines long as it can cause errors in parsing the resume.</sov:Information>
                <sov:Information>The following section types appear multiple times in the resume: LANGUAGES (2 occurrences), SKILLS (2 occurrences), HOBBIES (2 occurrences). Each section should only appear once in a resume.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
        <sov:Assessment>
            <sov:Level>Data Missing</sov:Level>
            <sov:Findings>
                <sov:Information>The following work history position does not have a job title:  POS-1. Every position in a resume should have a job title.</sov:Information>
                <sov:Information>The following educational degrees do not have a degree name: DEG-2, DEG-5, DEG-4, DEG-6. Every degree in a resume should have a name or type associated with it, such as 'BS' or 'MS'.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
        <sov:Assessment>
            <sov:Level>Suggested Improvements</sov:Level>
            <sov:Findings>
                <sov:Information>Skills section found in resume. Skills should not be in a separate section, but instead included in the descriptions of work history or education.</sov:Information>
                <sov:Information>The following work history position has a street level address included: POS-2. Including a street level address for anything other than the primary contact address can lead to unexpected results in the parser output and should be removed.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
    </sov:Assessments>
</sov:ResumeQuality>
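
As a rough sketch of how this output might be consumed, the Python below walks a ResumeQuality fragment and collects each level with its findings. The namespace URI bound to the sov prefix is a placeholder assumption (use whatever URI your actual parser output declares), and summarize_quality is a hypothetical helper name, not part of the Sovren API.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (an assumption for this sketch); real output
# declares its own URI for the sov prefix.
SAMPLE = """
<sov:ResumeQuality xmlns:sov="urn:example:sov">
    <sov:Assessments>
        <sov:Assessment>
            <sov:Level>Data Missing</sov:Level>
            <sov:Findings>
                <sov:Information>The following work history position does not have a job title: POS-1.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
        <sov:Assessment>
            <sov:Level>Suggested Improvements</sov:Level>
            <sov:Findings>
                <sov:Information>Skills section found in resume.</sov:Information>
            </sov:Findings>
        </sov:Assessment>
    </sov:Assessments>
</sov:ResumeQuality>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_quality(xml_text):
    """Return a list of (level, [findings]) in document order.

    Hypothetical helper: matches on local element names so it does not
    depend on the exact namespace URI.
    """
    root = ET.fromstring(xml_text)
    results = []
    for node in root.iter():
        if local_name(node.tag) != "Assessment":
            continue
        level = ""
        findings = []
        for child in node.iter():
            name = local_name(child.tag)
            if name == "Level":
                level = (child.text or "").strip()
            elif name == "Information":
                findings.append((child.text or "").strip())
        results.append((level, findings))
    return results


for level, findings in summarize_quality(SAMPLE):
    print(level, len(findings))
```

Because assessments are listed in decreasing order of severity, the first tuple returned is the most serious level reported for the resume.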

Bug Fixes

Resume Parser

Fixed a bug in skills parsing that caused some skills to be omitted.

Accuracy Improvements

Resume Parser

  • Much better Chinese person name parsing
  • Improved Resume Sectioning
  • Improved German language parsing
  • Improved Company Name parsing
  • Improved parsing of educational majors
  • Improved ability to detect and compensate for bad conversions where extra whitespace was inserted between words in sentences

Skills

Resume Parser

We restored the SkillId in the output of skills. We modified some skills and added new ones.

Version 8.2

December 26, 2017

Breaking Changes

AI Matching Engine

Typically, minor releases don't contain breaking changes, but this is a unique case: our AI Matching service hasn't yet gone into production for our clients, and we wanted to take the opportunity to add some important improvements.

We made the following changes to our AI Matching API:

  • Cleaned up matching/searching endpoints into Match by Document, Match by DocumentId, Match by Criteria, and Search
  • Moved the RevisionDateRange into FilterCriteria since it is restrictive and acts like a filter
  • Removed MaxRecords and added pagination to searching. We limit each page to 100 records, and allow you to query to the 10th page (1000 records).
  • Renamed MaxRecords to Take to follow the same naming convention as Searching
  • Renamed IndexIds to IndexIdsToSearchInto for Matching and Searching endpoints
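
A minimal sketch of the paging arithmetic implied by these limits, assuming a hypothetical skip offset paired with Take (only Take is named in these notes; check the API documentation for the actual request fields). It also assumes the 10-page limit applies regardless of page size.

```python
MAX_PAGE_SIZE = 100   # each page is limited to 100 records
MAX_PAGES = 10        # queries may reach the 10th page (1000 records)


def page_window(page, page_size=MAX_PAGE_SIZE):
    """Return the (skip, take) pair for a 1-based page number.

    'skip' is a hypothetical offset name used only for this sketch.
    """
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError("page_size must be between 1 and 100")
    if not 1 <= page <= MAX_PAGES:
        raise ValueError("page must be between 1 and 10")
    return (page - 1) * page_size, page_size


print(page_window(1))
print(page_window(10))
```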

Resume Parser

None.

New Features

AI Matching Engine

We added the following features:

  • Added the Reverse Compatibility Score to the AI Matching responses
  • Added matched terms to the category scores array in AI Matching responses
  • Synced the queries between Bimetric Matching and AI Matching
  • Added Category Weights as an input for AI Matching
  • Added an endpoint to match using a document that's already indexed

Resume Parser

Added Danish language parsing.

Bug Fixes

Resume Parser

Fixed a bug in skills parsing that caused some skills to be omitted.

Accuracy Improvements

Resume Parser

  • Improved resume sectioning
  • Improved Company Name parsing
  • Reduced false positives on some email addresses being reported for the candidate (but which did not actually belong to the candidate)
  • Improved accuracy on Norwegian-language resumes

Version 8.1

November 14, 2017

New Features

AI Matching Engine

Our new cloud-based matching platform provides a scalable solution to finding a needle in the haystack without the need for countless hours of reviewing resumes/jobs. Take a look at the documentation and the API to find out more.

Underlying Changes

We added several run-time bug fixes.

Version 8.0

September 26, 2017

Upgrade Path

Version 9 contains major improvements over version 8, and the upgrade process is the same, so we strongly recommend upgrading straight to version 9.

As discussed in the breaking changes section, this version isn't compatible with the prior SaaS versions. To upgrade to version 8.0, we recommend the following approach:

  1. If you're using a parser configuration string, regenerate your string in the new human-readable Name=Value pair format. Details on this new configuration string, and a conversion tool are documented here.
  2. Parse the Sample.doc file (as well as some of your own documents) both in the version you currently use and in 8.0 using our Demo Application, then save those results to disk.
  3. Use a document comparison tool to evaluate the differences, specifically the new fields. There is a lot of new metadata provided that could be of high value to integrate in your application. These new fields are detailed below in the New Features section. For a document comparison tool, we really like Beyond Compare.
  4. Remap your API calls to the new 8.0 methods as described in the API Documentation (REST | SOAP), make the desired changes to your implementation to leverage the new metadata, change the URL to point to version 8.0, and enjoy.
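
If a dedicated comparison tool is not at hand, the comparison in step 3 can be roughly approximated in code. The sketch below lists element paths that appear in the new output but not in the old one. The two inline documents are tiny stand-ins (their layout is an assumption, not real parser output), and element_paths and new_paths are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def element_paths(xml_text):
    """Return the set of slash-joined element paths found in a document."""
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def new_paths(old_xml, new_xml):
    """Paths present in the new output but absent from the old one."""
    return sorted(element_paths(new_xml) - element_paths(old_xml))


# Stand-in documents; real input would be the results saved to disk in step 2.
old = "<Resume><ContactInfo/></Resume>"
new = "<Resume><ContactInfo/><UserArea><AttentionNeeded/></UserArea></Resume>"
print(new_paths(old, new))
```

This only reports structural additions; changes in field values still need a side-by-side comparison.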

New Features

Metadata

We added some incredibly powerful metadata.

AverageMonthsPerEmployer - We now calculate the candidate's turnover rate; that is, how often they switch employers. Notice that we calculate this turnover per employer, not per job. In other words, if you have three jobs for one employer, stretching over 36 months, your average months per employer is 36, not 12.
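
The per-employer arithmetic can be illustrated with a short sketch. This is a simplified illustration of the description above, not the Parser's actual algorithm: it assumes each job is already reduced to an (employer, months) pair and ignores overlapping dates and gaps. The function name and the employer names are hypothetical.

```python
from collections import defaultdict


def average_months_per_employer(jobs):
    """jobs: iterable of (employer, months) pairs, one per job.

    Months are summed per employer first, then averaged across employers.
    Simplification: ignores overlapping dates and gaps.
    """
    months_by_employer = defaultdict(int)
    for employer, months in jobs:
        months_by_employer[employer] += months
    if not months_by_employer:
        return 0
    return sum(months_by_employer.values()) / len(months_by_employer)


# Three 12-month jobs at one employer: the average is 36, not 12.
print(average_months_per_employer([("Acme", 12), ("Acme", 12), ("Acme", 12)]))
```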

FulltimeDirectHirePredictiveIndex - This is a mouthful, but it's an incredibly powerful statistic. Although it is on a scale of 0 to 100, that does not imply that low scores are bad and high scores are good. Think of it as a scale where low scores mean that you are best suited for part time jobs, projects, consulting, etc., and high scores mean you are best suited - and most likely to want - fulltime direct hire jobs. This is an incredibly powerful tool that can be of substantial value-added benefit to every recruiter.

AttentionNeeded - We now have a single area where we output items that need your attention when thinking about how/where to place a candidate. This is where we now place the following notices:

  • Warnings about the Job Objective. In the past, when we calculated that the candidate's job objective indicated they wanted to make a career change (different industry, job titles, or management level), we noted that within the ExperienceSummary/Description node. We have now moved that notice to the AttentionNeeded node.
  • We added a new notice relating to management. If we detect that the candidate was previously in management, but their latest position was not in management, we call this to your attention.

Skills Alternate View

We replaced the old "Competencies" section with a more logical structure that is easier for users to consume. We also output some new data in this section, such as whether a skill is only reported as a parent or was actually found in the resume. View SkillsTaxonomyOutput Documentation for more details.

Normalization

We also added normalization of job titles as a standard feature and this data is always output. Refer to the PositionHistoryUserArea Documentation for more details.

Geocoding

We now supply geocoding through the new web API calling out to a third-party provider (Bing or Google). This geocoding is far more accurate than our old geocoding. The new geocoding is true address-level geocoding, whereas the old geocoding was postcode-level at best, and city-level in other cases, yet still used massive amounts of memory.

Twitter

We now output Twitter handles in the ContactInfo area as well as the ReservedData section of the Resume UserArea:

<ContactMethod>
    <Use>twitterHandle</Use>
    <Location>home</Location>
    <WhenAvailable>anytime</WhenAvailable>
    <InternetWebAddress>@twitQueen</InternetWebAddress>
</ContactMethod>
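
A minimal sketch of pulling the handle out of this structure, using the fragment above as input. It assumes the elements carry no XML namespace, as in the sample; real output may declare one, in which case the lookups need the namespace applied. twitter_handles is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<ContactMethod>
    <Use>twitterHandle</Use>
    <Location>home</Location>
    <WhenAvailable>anytime</WhenAvailable>
    <InternetWebAddress>@twitQueen</InternetWebAddress>
</ContactMethod>"""


def twitter_handles(contact_methods):
    """Collect InternetWebAddress values from twitterHandle contact methods."""
    handles = []
    for method in contact_methods:
        if method.findtext("Use") == "twitterHandle":
            address = method.findtext("InternetWebAddress")
            if address:
                handles.append(address.strip())
    return handles


print(twitter_handles([ET.fromstring(SAMPLE)]))
```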

Breaking Changes

This build is NOT drop-in compatible with any previous build. We made many breaking changes that were necessary in order to simplify the API and eliminate obsolete properties and methods.

In an effort to clean up the existing API and prepare for XML and JSON output, we removed the following API parameters (note: the updated endpoint documentation can be found in the API Documentation (REST | SOAP)):

ParseResume Endpoint

  • ParseResumeRequest
    • FileText
    • OutputXmlDoc
    • OutputWordXml
    • ParserVersion
    • OutputJson
  • ParseResumeResponse
    • XmlDoc
    • WordXml
    • WordXmlCode
    • ParserVersion
    • Xml
    • XmlCode
    • Json

ParseJobOrder Endpoint

  • ParseJobOrderRequest
    • FileText
    • OutputXmlDoc
    • OutputWordXml
    • ParserVersion
    • OutputJson
  • ParseJobOrderResponse
    • XmlDoc
    • WordXml
    • WordXmlCode
    • ParserVersion
    • Xml
    • XmlCode
    • Json

NormalizeResume Endpoint

  • NormalizeResumeRequest
    • Xml
  • NormalizeResumeResponse
    • Xml

We also redesigned our response codes and reduced the number of possible responses. Details can be found in the API Documentation (REST | SOAP).

For this release, we developed a much more readable and less error-prone Name=Value pair configuration string. The legacy "01010..." Parser configuration string has been deprecated, so we highly recommend that you browse to the Config String Builder tool and use it to generate your new configuration string. You can paste in your existing config string, and the page will prepopulate the form with your current settings and generate the config string in the new format.

Underlying Changes

Accuracy Improvements

By popular demand, we changed how we split data between job titles and company names on LinkedIn profiles. That means that job titles will often be longer, and company names will often be shorter.

Contact info, employment, and education are even more accurate.

Speed

The Parser's average parse time, as measured on our resumeparsing.com SaaS service using worldwide submissions in every language, is approximately 500 milliseconds. If you are not seeing sub-second average parse times, something is probably very wrong. Remember, ParseTime is always output in the Resume UserArea. Contact Sovren Support for help!

Skills

We deprecated the output of skills in the Resume.StructuredXMLResume.Qualifications.Competencies node. The HR-XML schema was simply not informative enough and not user-friendly. We also deprecated the output of BestFitTaxonomies. As mentioned above, both skills and best-fit taxonomies are now output in a hierarchical tree that is intuitive, informative, and easy to use. This output can be found in SkillsTaxonomyOutput.

We added dozens of new IT skills. We removed some skills that seemed to be of little value. We stopped some low-value skills from displaying in the output.

Prior to this release, several languages did not have built-in skills. That is no longer the case. We now have full skills trees in every language supported by the Parser.

Potentially breaking change: All custom skills taxonomies must now follow this pattern:

ParentTaxonomy => SubTaxonomy => Skill [ => Optional ChildSkill ]

In other words, you cannot have a skill tied to a taxonomy unless that taxonomy has a parent taxonomy.
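
As a hedged illustration of this rule, the check below treats each custom skill as a path of names from the top taxonomy down to the skill. The list representation, the function name, and the example names are all hypothetical; the real custom taxonomy file format is not shown in these notes.

```python
def is_valid_skill_path(path):
    """Check a path against ParentTaxonomy => SubTaxonomy => Skill [=> ChildSkill].

    'path' is a hypothetical list of names from the top taxonomy down to the skill.
    """
    return len(path) in (3, 4) and all(name.strip() for name in path)


print(is_valid_skill_path(["Information Technology", "Programming", "Python"]))
print(is_valid_skill_path(["Programming", "Python"]))
```

The second path fails because the skill's taxonomy has no parent taxonomy above it.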

Output

Choose JSON or XML

By popular demand: JSON!!! Through the new REST API you can now receive JSON that exactly mimics the HROpenStandards.org Resume 2.5 schema output.

Default settings

Please note that default settings have changed for date output formats, for skills output, and for which sections are parsed (we now parse and output more sections by default).

Employment History

By popular demand, we replaced the numeric log-scaled CompanyNameProbability and PositionTitleProbability in the PositionHistoryUserArea with two new nodes called PositionTitleProbabilityInterpretation and CompanyNameProbabilityInterpretation. These new fields represent human-readable interpretations of the CompanyNameProbability and PositionTitleProbability. For more information, including sample output, refer to the PositionHistoryUserArea Documentation.

By popular demand, all Employment History output is now reordered into reverse chronological order (i.e., most recent data first, followed by progressively older data), regardless of the resume's ordering of the jobs. Please note that the UserArea for each PositionHistory node still shows you the resume-order for each job.

Education History

By popular demand, all Education History output is now reordered into reverse chronological order, regardless of the resume's ordering of the schools. Please note that the UserArea for each PositionHistory node still shows you the resume-order for each school record.